Intel unveils ‘Lunar Lake’ with Lion Cove & Skymont CPU Cores, Xe2 GPU and NPU 4

At Computex 2024, Intel CEO Pat Gelsinger shared details about the Lunar Lake client computing processor, highlighting its architecture, performance, and efficiency advancements.

Intel’s Lunar Lake Construction

The Lunar Lake System-on-Chip (SOC) has seven key components starting with the interposer package, which includes memory, a stiffener, and a Base Tile.

The Base Tile uses Foveros interconnect to integrate the compute tile and Platform Controller Tile. Unlike Meteor Lake, Lunar Lake uses fewer tiles, which improves efficiency and reduces latency.

The compute tile is built on TSMC’s N3B process, while the Platform Controller Tile uses the TSMC N6 process.

Lunar Lake features on-package LPDDR5X memory in 16 GB and 32 GB configurations, with speeds up to 8533 MT/s per chip. The memory uses a 16b x4 channel layout (four 16-bit channels), which reduces PHY power by 40% and saves 250 mm² of area compared to traditional designs with memory on the PCB.
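As a rough illustration of what those memory figures imply, the sketch below computes the theoretical peak bandwidth of one memory package. It assumes that "16b x4" means four independent 16-bit channels per package; the function name and its parameters are hypothetical, and the result is a ceiling rather than a measured figure.

```python
# Back-of-the-envelope peak bandwidth for one on-package LPDDR5X device.
# Assumption (not stated in the article): "16b x4" means four independent
# 16-bit channels, i.e. a 64-bit interface per memory package.

def peak_bandwidth_gbs(transfer_rate_mts: float,
                       channels: int = 4,
                       channel_width_bits: int = 16) -> float:
    """Return the theoretical peak bandwidth in GB/s (decimal gigabytes)."""
    bus_width_bytes = channels * channel_width_bits / 8
    return transfer_rate_mts * 1e6 * bus_width_bytes / 1e9


per_package = peak_bandwidth_gbs(8533)
print(f"{per_package:.1f} GB/s per memory package")
```

Real-world throughput will be lower than this theoretical peak because of refresh cycles, bus turnarounds, and access patterns.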

The chip has an 8-core hybrid design with four P-Cores and four E-Cores, supported by a new Thread Director. Each P-Core has 2.5 MB of L2 cache, and the P-Cores share up to 12 MB of L3 cache. The E-Cores share 4 MB of L2 cache and deliver double the vector and AI throughput of their predecessors.
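To put the cache figures in one place, the short sketch below adds up the L2 and L3 capacities quoted above. The constant names are illustrative only, and the total deliberately leaves out the smaller per-core L0 and L1 caches.

```python
# Sum the L2 and L3 cache capacities quoted in the article.
# Constant names are illustrative; L0/L1 caches are not counted here.

P_CORES = 4
P_CORE_L2_MB = 2.5       # private L2 per P-Core
SHARED_L3_MB = 12        # "up to" figure quoted for the shared L3
E_CLUSTER_L2_MB = 4      # L2 shared by the four E-Cores

p_core_l2_total_mb = P_CORES * P_CORE_L2_MB
cpu_l2_l3_total_mb = p_core_l2_total_mb + SHARED_L3_MB + E_CLUSTER_L2_MB

print(f"P-Core L2 total: {p_core_l2_total_mb} MB")
print(f"CPU L2 + L3 total: {cpu_l2_l3_total_mb} MB")
```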

The Xe2 GPU in Lunar Lake has 8 Xe cores, 8 Ray Tracing Units, XMX support, and 8 MB of dedicated cache. The SOC delivers 120 TOPS in total, with 48 TOPS from the NPU, 67 TOPS from the GPU, and 5 TOPS from the CPU.
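The 120 TOPS platform figure is simply the sum of the three engines. A minimal sketch of that breakdown follows; the dictionary and variable names are illustrative.

```python
# Platform AI throughput as quoted in the article, split by engine.
tops_by_engine = {"NPU": 48, "GPU": 67, "CPU": 5}

platform_tops = sum(tops_by_engine.values())

# Share of the platform total contributed by each engine.
share_by_engine = {name: tops / platform_tops
                   for name, tops in tops_by_engine.items()}

print(f"Platform total: {platform_tops} TOPS")
for name, share in share_by_engine.items():
    print(f"{name}: {share:.1%} of total")
```

The GPU contributes the largest share of the quoted total, even though the NPU figure is the one most often cited for AI PC requirements.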

Intel Lion Cove P-Core Architecture

Lion Cove is the “P” core in Lunar Lake, designed for high performance and efficiency.

It improves single-threaded performance with a new microarchitecture, offering a 14% IPC boost over Redwood Cove cores. The core achieves a 15% increase in performance per watt and a 10% improvement in performance per area.

Key features include an 8-wide allocation/rename unit, a 12-wide retirement unit, a 576-deep instruction window, and 18 execution ports. The memory subsystem has a 3-level cache hierarchy, including a 48 KB L0 cache, 192 KB L1 cache, and 2.5 MB L2 cache per core.

Intel Skymont E-Core Architecture

Skymont is the “E” core in Lunar Lake, focused on efficiency. It enhances workload coverage, vector and AI throughput, and scalability. Skymont achieves a 38% IPC improvement in integer tasks and a 68% improvement in floating-point tasks over Crestmont E-Cores.
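Because the integer and floating-point gains differ, the uplift a given program sees depends on its instruction mix. The sketch below estimates a blended speedup at equal clocks, assuming a hypothetical `fp_fraction` that describes the share of baseline execution time spent in floating-point work. It is a simplified model for illustration, not an Intel methodology.

```python
# Estimate a blended speedup from separate integer and floating-point IPC gains.
# fp_fraction is a hypothetical workload parameter: the share of baseline
# execution time spent in floating-point work. Clock speed is assumed unchanged.

def blended_speedup(fp_fraction: float,
                    int_gain: float = 1.38,
                    fp_gain: float = 1.68) -> float:
    """Return the overall speedup for a workload with the given FP time share."""
    if not 0.0 <= fp_fraction <= 1.0:
        raise ValueError("fp_fraction must be between 0 and 1")
    # Each portion of the baseline runtime shrinks by its own gain factor.
    new_time = (1.0 - fp_fraction) / int_gain + fp_fraction / fp_gain
    return 1.0 / new_time


for fraction in (0.0, 0.25, 0.5, 1.0):
    print(f"FP share {fraction:.0%}: {blended_speedup(fraction):.2f}x")
```

The time-weighted form is used instead of a plain average of the two percentages because the slower-improving portion dominates the remaining runtime.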

The core features an 8-wide allocation unit and a 16-wide retirement unit, with a 416-entry out-of-order window. Vector performance is upgraded with four 128-bit floating-point and SIMD vector pipelines, improving AI capabilities.

The memory subsystem includes a 4 MB L2 cache shared by each four-core cluster, double the bandwidth, and faster L1-to-L1 transfers. Skymont E-Cores provide significant efficiency improvements, with up to 4x higher performance at peak power compared to Crestmont.

Updates to Power Management & Thread Director

Lunar Lake introduces an upgraded Thread Director for better P-Core and E-Core utilization.

Enhanced algorithms and finer granularity in workload handling improve efficiency. New OS Containment Zones manage power and performance by scheduling work to specific core types.

The power management block within the SOC includes three profiles: Best Efficiency Mode, Balanced Mode, and Performant Mode.

These profiles adjust SOC frequency and scheduling to optimize power savings, achieving up to 35% power savings in applications like Microsoft Teams.

Intel Lunar Lake’s NPU

Lunar Lake’s NPU 4 offers significant AI processing improvements with 48 Peak TOPS, a 4.36x increase over Meteor Lake’s NPU.

NPU 4 has 12K MACs and 6 Neural Compute Engines, with a higher clock rate of 1.95 GHz. It provides 12x higher vector performance, 4x higher AI TOPS, and 2x higher IP bandwidth.
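The 48 TOPS figure can be roughly reproduced from the MAC count and clock rate. The sketch below assumes that "12K" means 12 x 1024 MACs and that each MAC counts as two operations per cycle (one multiply and one add). Both are assumptions for illustration, not figures from Intel's disclosure.

```python
# Rough check of the NPU 4 peak TOPS figure from MAC count and clock rate.
# Assumptions (not from the article): "12K" = 12 * 1024 MACs, and each MAC
# counts as 2 operations per cycle (one multiply plus one add).

def peak_tops(total_macs: int, clock_ghz: float, ops_per_mac: int = 2) -> float:
    """Return theoretical peak throughput in tera-operations per second."""
    return total_macs * ops_per_mac * clock_ghz * 1e9 / 1e12


TOTAL_MACS = 12 * 1024
NEURAL_COMPUTE_ENGINES = 6

macs_per_engine = TOTAL_MACS // NEURAL_COMPUTE_ENGINES
npu4_tops = peak_tops(TOTAL_MACS, clock_ghz=1.95)

# Meteor Lake NPU throughput implied by the quoted 4.36x ratio.
implied_meteor_lake_tops = 48 / 4.36

print(f"MACs per engine: {macs_per_engine}")
print(f"NPU 4 peak: {npu4_tops:.1f} TOPS")
print(f"Implied Meteor Lake NPU: {implied_meteor_lake_tops:.1f} TOPS")
```

Under these assumptions the result lands just under 48 TOPS, which is consistent with the rounded figure quoted above.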

Lunar Lake’s IO and Connectivity

Lunar Lake includes updated connectivity with Wi-Fi 7 and Thunderbolt 4 support. It features up to 3 Thunderbolt 4 ports with 25% better speeds using Thunderbolt 5 SSDs.

The integrated Wi-Fi 7 solution is 28% smaller in silicon area, uses an 11 Gbps CNVio 3 interface, and improves reliability with Multi-Link Operation (MLO).

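Hardware security is handled by four dedicated engines: the Intel Silicon Security Engine (SSE), Graphics Security Controller (GSC), Converged Security and Manageability Engine (CSME), and Partner Security Engine (PSE).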

Availability

Intel expects more than 80 Lunar Lake designs from over 20 partners, with a Q3 2024 launch and broader availability in Q4 2024. An AI PC developer kit based on Lunar Lake will also be available, supporting future CPUs like Panther Lake.

