A new update to Intel's documentation for software developers indicates that the company will soon begin introducing various AVX-512 instruction set extensions to its consumer CPUs. This will start with the processors codenamed Cannon Lake (CNL) and Ice Lake (ICL), made using 10 nm process technologies. The new extensions will enable future chips to improve performance in certain applications. One of the main open questions is which consumer programs will actually support AVX-512 when the CNL and ICL processors hit the market. In addition to AVX-512, the upcoming processors will introduce a host of other new instructions.
According to the Intel Architecture Instruction Set Extensions and Future Features Programming Reference document, Intel’s Cannon Lake CPUs will support AVX512F, AVX512CD, AVX512DQ, AVX512BW, and AVX512VL. This will bring the feature set of these CPUs to the current level of the Skylake-SP based processors. In addition, the Cannon Lake microarchitecture will support the AVX512_IFMA and AVX512_VBMI instructions. At this point, it is unclear whether that support will be limited to servers or will also be featured in consumer processors; the document's wording suggests the latter, but it is not explicit.
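For developers who want to check for these extensions at run time, the feature flags are reported by the CPUID instruction (leaf 7, sub-leaf 0). The sketch below only decodes register values that have already been read: Python cannot execute CPUID directly, so the `ebx` and `ecx` inputs are assumed to come from elsewhere (for example, a small native helper), and the function and table names are illustrative rather than part of any Intel API. The bit positions follow Intel's published CPUID documentation.

```python
# Bit positions in CPUID.(EAX=7, ECX=0), per Intel's CPUID documentation.
EBX_BITS = {
    "AVX512F": 16,
    "AVX512DQ": 17,
    "AVX512_IFMA": 21,
    "AVX512CD": 28,
    "AVX512BW": 30,
    "AVX512VL": 31,
}
ECX_BITS = {
    "AVX512_VBMI": 1,
}


def decode_avx512_features(ebx, ecx):
    """Map raw leaf-7 EBX/ECX values to a {feature name: supported} dict."""
    features = {name: bool((ebx >> bit) & 1) for name, bit in EBX_BITS.items()}
    features.update(
        {name: bool((ecx >> bit) & 1) for name, bit in ECX_BITS.items()}
    )
    return features


# Hypothetical register value with only the five baseline bits set
# (F, DQ, CD, BW, VL); this is sample data, not a reading from real hardware.
example_ebx = (1 << 16) | (1 << 17) | (1 << 28) | (1 << 30) | (1 << 31)

for name, supported in decode_avx512_features(example_ebx, 0).items():
    print(f"{name}: {'yes' if supported else 'no'}")
```

With the sample value above, the five baseline subsets are reported as present, while AVX512_IFMA and AVX512_VBMI are reported as absent.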
Intel originally promised to release Cannon Lake processors in the 2016 – 2017 timeframe, but delayed the introduction of its 10 nm process technology to 2018, postponing the CPU launch as well. Initially, it was expected that the Cannon Lake CPUs would generally resemble the Kaby Lake and Coffee Lake chips with some refinements, but the addition of AVX-512 support means a rather tangible architectural improvement. AVX-512 operates on large chunks of data and therefore requires massive memory bandwidth, which the Skylake-SP cores get thanks to large caches and more memory controllers. Keeping memory bandwidth and power consumption in mind, AVX-512 might not be supported by all Cannon Lake client CPUs, but only by those aimed at higher-performance machines (i.e., no AVX-512 for ULP mobile parts or entry-level desktop SKUs, although this is speculation at this point). Meanwhile, the good news is that by the time AVX-512-supporting Cannon Lake processors arrive, programs for client PCs that take advantage of the latest extensions will likely be available.
The evolution of AVX-512 on general-purpose CPUs is not going to stop there. Intel’s Ice Lake processors will support the AVX512_VPOPCNTDQ instructions (which will also be supported by the Xeon Phi ‘Knights Mill’) as well as the AVX512_VNNI, AVX512_VBMI2, AVX512+VPCLMULQDQ and AVX512_BITALG instructions. The ICL chips will also feature AVX-512 versions of the AES and GFNI instructions used for encryption and error correction: AVX512+VAES and AVX512+GFNI.
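To give a rough idea of what two of these extensions compute, the sketch below models individual lanes in plain Python. It is a scalar illustration only, not how the hardware is programmed: `popcount_lanes` mimics the per-element population count of AVX512_VPOPCNTDQ, and `vnni_dot_lane` mimics the non-saturating VPDPBUSD instruction from AVX512_VNNI, which multiplies four unsigned bytes by four signed bytes and adds the sum of the products to a 32-bit accumulator. The function names are made up for this example.

```python
def popcount_lanes(values, width=64):
    """Count the set bits in each lane (VPOPCNTQ-style when width=64)."""
    mask = (1 << width) - 1
    return [bin(v & mask).count("1") for v in values]


def vnni_dot_lane(acc, unsigned_bytes, signed_bytes):
    """Model one 32-bit lane of a VPDPBUSD-style step: acc += sum(u8 * s8)."""
    if len(unsigned_bytes) != 4 or len(signed_bytes) != 4:
        raise ValueError("each 32-bit lane consumes exactly four byte pairs")
    if not all(0 <= u <= 255 for u in unsigned_bytes):
        raise ValueError("first operand must hold unsigned 8-bit values")
    if not all(-128 <= s <= 127 for s in signed_bytes):
        raise ValueError("second operand must hold signed 8-bit values")
    total = acc + sum(u * s for u, s in zip(unsigned_bytes, signed_bytes))
    # Wrap to a signed 32-bit value, as the non-saturating form does.
    total &= 0xFFFFFFFF
    return total - (1 << 32) if total & 0x80000000 else total


# A 512-bit register holds eight 64-bit lanes or sixteen 32-bit lanes;
# only a few lanes are shown here.
print(popcount_lanes([0b1011, 0xFF, 0]))
print(vnni_dot_lane(10, [1, 2, 3, 4], [5, -6, 7, -8]))
```

In hardware, the same operation is applied to every lane of the vector in a single instruction, which is where the benefit for low-precision inference workloads comes from.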
Meanwhile, Knights Mill will exclusively support AVX512_4FMAPS and AVX512_4VNNI, at least for a while. An Intel filing with the Linux kernel states that the upcoming Xeon Phi and Xeon CPUs will support both instructions, but descriptions of Linux patches are not always accurate, and plans tend to change.
|AVX-512 Support Propagation by Various Intel CPUs|
|Xeon, Core X||General||Xeon Phi|
|Source: Intel Architecture Instruction Set Extensions and Future Features Programming Reference (pages 12 and 13)|
As it turns out from Intel’s document, the Cannon Lake and Ice Lake processors will have up-to-date AVX-512 support. It is unknown whether the CNL and ICL cores will be used inside future server processors (remember that Intel has the server-specific 'Cascade Lake' product incoming), but if this is the case, then it looks like Intel’s cores for server and client computers will have the same feature set going forward, at least when it comes to AVX-512 support.
Adding AVX-512 to consumer processors looks like an important development, even though the instruction set was primarily designed to process the large amounts of data common for servers and, to a degree, workstations (in workloads such as encoding, rendering, cryptography, and deep learning). Apparently, Intel believes that 512-bit INT/FP calculations will be important for mainstream PCs as well. A big question is how exactly Intel plans to implement AVX-512 in the various Cannon Lake and Ice Lake processors. Keep in mind that Intel’s six- and eight-core Skylake-X CPUs officially support one fused FMA unit for AVX-512-F, whereas the chips with 10+ cores officially support dual 512-bit AVX-512-F ports and can offer up to two times higher performance. So in that respect, there is potential for further differentiation between products.
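The "up to two times" figure follows from simple arithmetic on vector width and the number of FMA units. Here is a minimal sketch, assuming single-precision (32-bit) elements and counting each fused multiply-add as two floating-point operations. The function name is illustrative, and the result is a theoretical per-core, per-cycle peak that ignores clock speed, AVX frequency offsets and memory limits.

```python
def peak_flops_per_cycle(fma_units, vector_bits=512, element_bits=32):
    """Theoretical peak FLOPs per cycle for one core."""
    lanes = vector_bits // element_bits  # elements processed per instruction
    flops_per_fma = 2                    # one multiply plus one add
    return fma_units * lanes * flops_per_fma


single = peak_flops_per_cycle(fma_units=1)
dual = peak_flops_per_cycle(fma_units=2)
print(f"one 512-bit FMA unit:  {single} FLOPs/cycle")
print(f"two 512-bit FMA units: {dual} FLOPs/cycle ({dual / single:.0f}x)")
```

Real-world gains are usually smaller than this ratio, since few workloads keep both units fed on every cycle.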
In the meantime, Intel’s Cannon Lake and Ice Lake CPUs will also gain a number of other new instructions for various purposes, and they are certainly worth looking at.
In a bid to speed up certain cryptography algorithms, Cannon Lake will feature the SHA-NI instruction set that is already supported by the Goldmont cores. SHA-NI is similar in concept to AES-NI, which was added several generations prior. Based on Intel’s publications, SHA-NI can speed up the SHA1, SHA256 and SHA224 algorithms. In addition, the new CPUs will support the UMIP security mechanism, which prevents the execution of certain instructions if the privilege level is insufficient, stopping certain apps from accessing OS settings.
The Ice Lake chips will bring support for the Fast Short REP MOV instruction, which will enable fast moves of large amounts of data from one location to another and will benefit optimized memory-intensive applications. Keep in mind that we are moving towards persistent memory for a number of server applications, so large amounts of data located in DRAM and/or NVDIMMs will be more common in the future.
Another interesting feature supported by the Ice Lake consumer processors is the CLWB (Cache Line Write Back) instruction for NVMe programming. The feature is already supported by the Skylake-SP cores and is required to better handle SSDs connected to the processor, but it will come to consumer products with Ice Lake. CLWB writes back modified cache lines but does not invalidate the data, keeping it available if it is needed after the line is flushed and thus improving performance in certain situations. Given the Purley/Skylake-SP context, CLWB is required for upcoming NVDIMMs (based on 3D XPoint), but it is not completely clear how Intel expects to use it on consumer platforms (it makes sense for certain workstation applications, which is why CLWB is supported by SKL-SP). In any case, the addition of CLWB will add some speed in certain cases when very fast SSDs are used and cache misses are an issue.
There are other features coming in the Goldmont Plus (the heart of the upcoming Gemini Lake SoCs) and Ice Lake processors, namely PTWRITE and RDPID, which seem to be aimed mostly at software developers and may not benefit end users right away.
|Instruction Set Extensions of Cannon Lake, Ice Lake and Goldmont+ CPUs|
|Cannon Lake||SHA-NI||Security||Cryptography acceleration.|
|User-Mode Instruction Prevention||Security||Prevents execution of certain instructions if the Current Privilege Level (CPL) is greater than 0. If these instructions were executed while in CPL > 0, user space applications could have access to system-wide settings such as the global and local descriptor tables, the task register and the interrupt descriptor table.|
|CLWB||Performance||Writes back modified data of a cache line similar to CLFLUSHOPT, but avoids invalidating the line from the cache (and instead transitions the line to non-modified state). CLWB attempts to minimize the compulsory cache miss if the same data is accessed temporally after the line is flushed.|
|Fast Short REP MOV||Performance||Enables fast moves of data from one location to another.|
|Read Processor ID||General||Quickly reads processor ID to discover its feature set and apply optimizations/use specific code path if possible.|
Write Data to a Processor Trace Packet
|Source: Intel Architecture Instruction Set Extensions and Future Features Programming Reference (pages 12 and 13)|
Intel and AMD have been adding various instruction set extensions to the x86 architecture since the mid-1990s. Over the past 20 years, both companies have brought in hundreds of new instructions designed to improve performance in various applications, either through SIMD instructions that feed CPU cores large amounts of data at once or through special-purpose hardware. Intel’s latest mainstream extensions are AVX and AVX2. Their main purposes were to increase the width of SIMD operations (both floating-point and integer) to 256 bits and to introduce instructions such as FMA3, which serves the same goal by doing relatively complex computations in one instruction. To perform 256-bit AVX2 operations, CPUs have to lower their frequency to maintain stability, as cores tend to draw a lot of power under such workloads, but even at lower clock rates AVX/AVX2 make a lot of sense and increase overall throughput.
The next step in the evolution of Intel's instruction set extensions was AVX-512. With AVX-512, the company decided to introduce different sets of instructions for different applications and to implement them in different products. Some of the AVX-512 extensions are aimed primarily at enterprise workloads, whereas others are needed for supercomputers or high-performance computing. Implementing all of them in all products hardly makes sense for Intel and its customers, so the latest Skylake-SP Xeons (and the high-end desktop processors) support one set of AVX-512 instructions and the Xeon Phis support another. In the meantime, contemporary mainstream consumer CPUs do not support AVX-512 at all. One reason is that the physical implementation significantly increases die size (by up to 15% in the case of the Skylake core). Other factors are the cost associated with a larger die and the fact that client applications today cannot take advantage of such instructions. This is going to change, as Intel plans to enable support for certain AVX-512 variations in its future Cannon Lake and Ice Lake processors for mainstream consumers.
The addition of AVX-512 to future consumer CPUs is good news for those who use such processors for things like video encoding, rendering or other applications that are common for workstations. Meanwhile, with the Ice Lake consumer chips, Intel is adding deep learning-specific 512-bit instructions (AVX512_VNNI) as well as NVDIMM-oriented features such as CLWB, although the immediate advantages for this market segment are unclear. Intel is opening this information up to allow developers to prepare for these processors and develop software in advance. In any case, new features are always welcomed by many because at some point they start to bring certain advantages.
Storage enclosures come in many varieties to target different market segments. They usually have one or more downstream SATA ports, with USB being a popular interface in the low-end and mid-range markets. Within the USB storage enclosure market, device vendors have multiple opportunities to tune their product design for specific use-cases. Today's review will take a look at HighPoint's RocketStor RS6114V, a 4-bay direct-attached storage enclosure backed up by their software RAID stack.
Not willing to be left behind at the starting line, Biostar has announced its entries into the rapidly growing Z370 motherboard market. At the time of publication, Biostar is bringing two boards to the table from its Racing line: the Z370GT7 and Z370GT6. The GT7 is the company's flagship board and, accordingly, will be the more expensive of the two. While both are full-featured motherboards, the GT7 offers an additional M.2 heatsink over the GT6 (bringing the total to two), three reinforced full-length PCIe slots compared to the GT6's two, and additional shrouding covering the audio section of the board. Outside of that, differences between the boards will be difficult to spot.
The Biostar Racing line is now on its third-generation aesthetic, which features a gold and black color theme. All heatsinks on the board are black and adorned with yellow accents, while the PCB itself is black. The “R” (Racing) symbol is found prominently on the chipset heatsink. Other gold accents are scattered around the board.
Both boards use an 11-phase VRM to drive the Coffee Lake-based CPUs. Both boards also feature three full-length PCIe x16 slots and three x1 slots. In the top right corner, both boards have a debug LED, a BIOS switch, and a panel with four buttons for power/reset functionality as well as Turbo and Eco modes. RGB LEDs can be found on both boards: the GT7’s are located on the back panel IO shroud, while the GT6’s are found to the left of the audio section. The integrated LEDs and external LEDs (via two headers) can be controlled with Biostar’s Vivid LED DJ utility. It features 10 different flashing modes along with color, speed, and brightness controls, and allows each lighting zone to be controlled independently.
Neither board uses reinforced DIMM slots; however, two full-length PCIe slots on the GT6 and all three on the GT7 get protection. The slots break down to x8/x8/x4 and both boards support 3-way AMD Crossfire, though it should be noted that there is no mention of SLI support in any form on the specifications page. The last full-length slot, at x4, shares bandwidth with the second M.2 slot. The boards' four DIMM slots support up to 64GB of RAM at speeds up to DDR4 3866(OC). While still fast, this is the second-lowest speed we have seen supported across all Z370 boards covered. Only the ECS board supported slower maximum speeds (DDR4 3200).
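For a rough sense of what the x8/x8/x4 split means in bandwidth terms, the sketch below applies the PCIe 3.0 signaling rate (8 GT/s per lane with 128b/130b encoding). These are theoretical per-direction maxima before protocol overhead, and the function name is illustrative.

```python
def pcie3_bandwidth_gbs(lanes):
    """Theoretical one-direction PCIe 3.0 bandwidth in GB/s."""
    transfers_per_second = 8e9       # 8 GT/s per lane
    encoding_efficiency = 128 / 130  # 128b/130b line encoding
    bits_per_second = transfers_per_second * encoding_efficiency * lanes
    return bits_per_second / 8 / 1e9


for lanes in (8, 8, 4):
    print(f"x{lanes}: {pcie3_bandwidth_gbs(lanes):.2f} GB/s")
```

Since the x4 slot shares its bandwidth with the second M.2 slot, that last figure is the ceiling for both devices combined when they are used together.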
For mass storage purposes, both boards use the full allotment of six chipset-managed SATA ports. However, instead of locating these in their typical position to the right of the PCH heatsink on the bottom half of the board, Biostar has placed them towards the middle of the board and oriented them vertically. The first M.2 slot is above the top full-length PCIe slot and supports up to 80mm sticks, while the second slot can be found between the bottom two PCIe slots and supports up to 110mm devices. The GT7 offers heatsinks on both M.2 slots, while the GT6 only cools the bottom slot.
As for cooling, the board gives users a total of five four-pin fan headers scattered in various locations around the board. These can be controlled via voltage or PWM through the BIOS or through the Windows-based application. Audio functionality is handled by the Realtek ALC1220 codec, with EMI shielding, what look to be Chemicon audio caps, and separation from the rest of the board. Network capabilities on both boards are handled by the Intel I219-V Gigabit Ethernet controller, which supports LAN surge protection.
Both the GT6 and GT7 have the same number and types of USB ports. There are two USB 3.1 (5 Gbps) Type-A ports and one Type-C, and an additional two USB 2.0 ports on the back panel IO. Internally there is an additional USB 3.1 (5 Gbps) header and USB 2.0 header for front panel connections. The back panel IO also contains a PS/2 port, DVI-D, and HDMI for video outputs, as well as a six plug audio stack. The GT7 chooses to use all black colored plugs versus the GT6 using the color-coded version most are familiar with.
|Biostar Z370GT6 & Z370GT7|
|Warranty Period||3 Years|
|Product Page||Z370GT6 / Z370GT7|
|Chipset||Intel Z370 Express|
|Memory Slots (DDR4)||Four DDR4; Support DDR4 3866(OC)|
|Network Connectivity||1 x Intel I219-V LAN|
|Onboard Audio||Realtek ALC1220|
|PCIe Slots for Graphics (from CPU)||2 x PCIe 3.0 x16 slots @ x8; 1 x PCIe 3.0 x16 slot @ x4|
|PCIe Slots for Other (from Chipset)||3 x PCIe 3.0 x1 slots @ x1|
|Onboard SATA||6 x Supporting RAID 0/1/5/10|
|Onboard SATA Express||None|
|Onboard M.2||2 x PCIe 3.0 x4 - NVMe or SATA|
|USB 3.1||2 x Type-A (10 Gbps) Back Panel; 1 x Type-C (10 Gbps) Back Panel; 2 x Type-A (5 Gbps) Back Panel; 2 x Header|
|USB 2.0||2 x Ports Back Panel; 2 x Header|
|Power Connectors||1 x 24-pin EATX; 1 x 8-pin ATX 12V|
|Fan Headers||2 x CPU; 3 x System (PWM and DC Controlled)|
|IO Panel||1 x PS/2 keyboard/mouse port; 2 x USB 3.1 G2 ports; 1 x USB 3.1 Type-C; 2 x USB 3.1 Type-A; 1 x HDMI; 1 x DVI-D; 2 x RJ-45 LAN Port; 5 x Audio Jacks|
Razer this week announced that it is upgrading its 13.3” Blade Stealth laptop with Intel’s new quad-core Core i7-8550U microprocessor, along with faster LPDDR3 memory. This makes Razer the latest of several laptop vendors to capitalize on the launch of Intel's 8th Gen Core series of CPUs by integrating the new chips into existing ultrabook designs.
Besides shipping with Intel's Core i7-8550U, the updated Razer Blade Stealth 13.3” also comes standard with 16 GB of LPDDR3-2133 memory, as well as a 512 GB PCIe 3.0 x4 SSD. The laptop also retains support for Thunderbolt 3 and eGFX, allowing the integrated Intel UHD Graphics 620 to be augmented with AMD Radeon and NVIDIA GeForce video cards in an eGFX chassis. In either scenario, the upgraded Blade Stealth has the same 13.3” IGZO panel as the model released in June, with a 3200×1800 resolution (QHD+), 400 nits of brightness and 100% sRGB color gamut coverage.
Coming off the heels of Intel's dual-core Kaby Lake-U CPUs, the big draw for the new Kaby Lake Refresh-U CPUs is of course the additional two CPU cores. For moderately-to-heavily threaded workloads that can use more than two cores, these newer quad-core CPUs can offer a sizable boost in performance. Interestingly, Razer also claims that the new version of the laptop has a longer battery life, despite the fact that the battery size is unchanged. That said, Razer hasn't left the laptop's chassis completely untouched; the quad-core Blade Stealth is slightly thicker, adding 0.7 mm over its predecessor.
Otherwise when it comes to connectivity, the updated Blade Stealth 13.3” has all the same features as its predecessor does: a Killer Wireless AC 802.11ac + Bluetooth 4.1 module, an Intel Thunderbolt 3 controller supporting one USB Type-C port, two USB 3.0 connectors, an HDMI 2.0a display output, a 720p webcam, a TRRS audio port, an RGB-backlit Razer Chroma keyboard and so on. The system is equipped with the same 53.6 Wh lithium-ion polymer battery as the previous model, but Razer claims that the upgraded Blade Stealth can now last for 10 hours on one charge. In addition, the machine comes with a 65 W USB-C power adapter (up from 45 W for the earlier models) which hopefully means that it will also charge faster.
|Razer Blade Stealth Laptops: Fall 2017, Default Configurations|
1.8 GHz/4 GHz
8 MB LLC
2.7 GHz/3.5 GHz
4 MB LLC
|Graphics||Intel HD Graphics 620|
|Storage||512 GB SSD||256 GB SSD||512 GB/1 TB SSD|
|Wi-Fi||Killer 802.11ac Wi-Fi module|
|USB||2 × USB Type-A|
|Thunderbolt||1 × Thunderbolt 3 port (USB Type-C)|
|Other I/O||HDMI 2.0a, 720p webcam, TRRS connector for audio, speakers, microphone|
|Dimensions||Height||13.8 mm/0.54"||13.1 mm/0.52"|
|Battery Life||10 hours|
The new quad-core Razer Blade Stealth 13.3" comes in a CNC-milled aluminum chassis with a black or gunmetal gray finish, though the chassis is 0.7 mm/0.02" thicker than the one used for the dual-core Blade Stealth 13.3". The new system in its default configuration (see the table above) is available for $1,699 from RazerStore.com in the U.S., Canada, France, United Kingdom, and Germany. This is a bit higher than the price of the older dual-core version, but Razer does not offer the new model with a 256 GB SSD, so the new model has higher baseline specifications.
On that note, it should be pointed out that the new quad-core version of the laptop adds to the existing Stealth family, rather than replacing it wholesale. The company and its partners also continue to offer previous-gen Blade Stealth 13.3”/QHD+ laptops: the entry-level Blade Stealth with a 256 GB SSD is now available for $1,349.99, whereas the higher-end Blade Stealth with a 1 TB SSD can be had for $1,699.
It is noteworthy that Razer is not upgrading the 12.5” version of the Blade Stealth that features a 4K UHD display, and it looks like this is a deliberate decision. The key feature of this notebook is its 4K UHD display, which offers one of the highest pixel densities (for a laptop) in the industry, and the even smaller laptop isn't a great fit for the higher-performing quad-core CPUs, at least not without some sacrifices to size or throttling.
G.Skill has launched a new series of memory module kits optimized for Intel’s new 8th Generation Core processors. The new DIMMs belong to G.Skill’s Trident Z and Trident Z RGB families and are guaranteed to operate at 3733 – 4600 MT/s data transfer rates when paired with Intel's Coffee Lake processors. Some of the modules need significantly increased voltages and thus require higher-end motherboards that can deliver “clean” power.
Getting right down to business, the fact that G.Skill even announced memory kits specifically for Coffee Lake got an eyebrow raise out of us. At first blush, it seemed like a marketing stunt, especially since the company is using the same Samsung B-die chips that it has been using for some time now. But according to G.Skill, Coffee Lake's memory controller behaves ever so slightly differently than Kaby Lake's when overclocked, necessitating the new modules.
Sure enough, if we compare G.Skill's DDR4-4200 and DDR4-4600 modules for the new Coffee Lake/Z370 and the Kaby Lake-X/X299 platforms, we will notice that the modules for Coffee Lake have looser tRAS sub-timings than the modules for Kaby Lake. From a performance point of view, tRAS might not be a big deal, but it's an unexpected change; if anything, we would have expected Coffee Lake to accept the same timings as Kaby Lake. There are a few possible reasons for this difference, not the least of which is the immature Z370 platform. However, the more interesting options are that it's a product of the new manufacturing process, or possibly even a new memory controller entirely (especially seeing as how Coffee Lake doesn't support DDR3).
Otherwise, G.Skill's tinkering only seems to have been necessary for their fastest modules, as their lower-clocked enthusiast-class memory sticks are unchanged from earlier revisions. Conversely, since G.Skill has just loosened the timings of their new high-speed DIMMs, they should continue to work fine in other platforms.
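To put sub-timings such as tRAS in perspective, it helps to convert cycle counts into nanoseconds. The sketch below assumes the usual DDR convention that the memory clock runs at half the data rate, so one cycle lasts 2000 / (data rate in MT/s) nanoseconds. The sample values are taken from G.Skill's published timings for two of the Coffee Lake kits covered in this article (DDR4-4400 at CL19 with tRAS 39, and DDR4-4600 at CL19 with tRAS 45); the function name is illustrative.

```python
def cycles_to_ns(cycles, data_rate_mts):
    """Convert a timing in clock cycles to nanoseconds for a DDR data rate."""
    cycle_time_ns = 2000.0 / data_rate_mts  # clock is half the data rate
    return cycles * cycle_time_ns


kits = (
    ("DDR4-4400", 4400, 19, 39),
    ("DDR4-4600", 4600, 19, 45),
)
for name, rate, cl, tras in kits:
    print(
        f"{name}: CL = {cycles_to_ns(cl, rate):.2f} ns, "
        f"tRAS = {cycles_to_ns(tras, rate):.2f} ns"
    )
```

Because the cycle time shrinks as the data rate rises, a larger cycle count at a higher speed does not necessarily mean a proportionally longer delay in absolute terms.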
|Evolution of Intel DDR4 Memory Controllers for Socketed CPUs|
|Haswell-E||Skylake||Broadwell-E||Kaby Lake||Skylake-X||Coffee Lake|
|Number of Channels||4||2||4||2||4||2|
|DIMMs per Channel||2|
|Voltages||1.2 V||1.2 V||1.2 V||1.2 V||1.2 V||1.2 V|
|Launch Timeframe||Q3 2014||Q3 2015||Q2 2016||Q1 2017||Q2 2017||Q4 2017|
Overall, G.Skill’s lineup of Coffee Lake-optimized DRAM kits consists of seven products featuring two or four 8 GB or 16 GB modules based on Samsung’s B-die chips. The rather broad family of Coffee Lake-optimized memory products is aimed at different classes of systems. The fastest DDR4-4400/4500/4600 DIMMs are only available in 8GB capacities and require 1.4 V, 1.45 V or even 1.5 V. G.Skill positions these modules for enthusiasts seeking maximum performance and not interested in maximizing DRAM content per box. G.Skill’s ‘mid-range’ kits for Coffee Lake run at DDR4-4000/4200, have 32 GB of capacity (16 GB DIMMs), and are designed for those who need high memory bandwidth along with a decent amount of RAM. Finally, there is a 64 GB DDR4-3733 kit for users who run memory-intensive applications.
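As a rough guide to what these data rates mean for bandwidth, the theoretical peak can be computed from the transfer rate, the 64-bit (8-byte) width of a DDR4 channel, and the two memory channels of the Coffee Lake platform. This is an upper bound under those assumptions, not a measured figure, and the function name is illustrative.

```python
def peak_bandwidth_gbs(data_rate_mts, channels=2, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_second = data_rate_mts * 1e6 * bytes_per_transfer * channels
    return bytes_per_second / 1e9


for rate in (3733, 4000, 4200, 4600):
    print(f"DDR4-{rate}: {peak_bandwidth_gbs(rate):.1f} GB/s")
```

Sustained bandwidth in real applications will be lower, as it also depends on timings, access patterns and the memory controller.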
Traditionally, all the Trident Z modules come with XMP 2.0 SPD profiles to simplify their setup on optimized platforms. In addition, the modules are equipped with G.Skill’s proprietary aluminum heat spreaders. Meanwhile, the Coffee Lake-optimized lineup from G.Skill also includes two Trident Z RGB options with programmable LED lighting.
|G.Skill's Trident Z Memory for Intel's Coffee Lake/Z370 Platform|
|DDR4-3733||CL17 19-19-39||1.35 V||4×16 GB||64 GB||Trident Z RGB||F4-3733C17Q-64GTZR|
|DDR4-4000||CL18 19-19-39||4×8 GB||32 GB||F4-4000C18Q-32GTZR|
|CL19 19-19-39||2×16 GB||Trident Z||F4-4000C19D-32GTZKK|
|DDR4-4200||CL19 21-21-41||1.4 V||4×8 GB||F4-4200C19Q-32GTZKK|
|DDR4-4266||CL19 23-23-43||4×8 GB||Trident Z RGB||F4-4266C19Q-32GTZR|
|DDR4-4400||CL19 19-19-39||2×8 GB||16 GB||Trident Z||F4-4400C19D-16GTZKK|
|DDR4-4500||CL19 19-19-39||1.45 V||F4-4500C19D-16GTZKK|
|DDR4-4600||CL19 25-25-45||1.5 V||F4-4600C19D-16GTZKK|
G.Skill has validated its new memory kits using Intel Z370-based motherboards from ASUS — the ROG Maximus X Hero, ROG Maximus X Apex and the ROG Maximus X Formula.
Finally, G.Skill plans to start selling the new Coffee Lake-optimized Trident Z and Trident Z RGB memory kits in November with the fastest Trident Z RGB DDR4-4266 arriving in December. The company traditionally does not touch upon MSRPs of its products in its announcements because DRAM prices tend to fluctuate. Meanwhile, since we are dealing with the latest products for a premium platform, expect appropriate prices.