

NVIDIA GH200 Superchip Revolutionizes Apache Spark with Unprecedented Efficiency

Terrill Dicki   Aug 21, 2024 08:49


As the growth of generative AI continues to surge, IT leaders are seeking ways to optimize data center resources. According to the NVIDIA Technical Blog, the newly introduced NVIDIA GH200 Grace Hopper Superchip offers a groundbreaking solution for Apache Spark users, promising substantial improvements in energy efficiency and node consolidation.

Tackling Legacy Bottlenecks in CPU-Based Apache Spark Systems

Apache Spark, a multi-language open-source system, has been instrumental in handling massive volumes of data across various industries. Despite Spark's advantages, running it on traditional CPU-based systems brings significant limitations that lead to inefficiencies in data processing workflows.

Pioneering a New Era of Converged CPU-GPU Superchips

NVIDIA's GH200 Superchip addresses these limitations by integrating the Arm-based Grace CPU with the Hopper GPU architecture, connected via NVLink-C2C technology. The interconnect offers up to 900 GB/s of bandwidth, which NVIDIA puts at about 7x that of the standard x16 PCIe Gen5 links found in traditional systems.

The GH200's architecture lets the CPU and GPU share memory, removing the need to copy data back and forth between them. According to NVIDIA, this accelerates Apache Spark workloads by up to 35x. For large clusters of over 1,500 nodes, this translates to a reduction of up to 22x in the number of nodes and annual energy savings of up to 14 GWh.

NVIDIA GH200 Sets New Highs in NDS Performance Benchmarks

Tests with the NVIDIA Decision Support (NDS) benchmark showed that Apache Spark runs significantly faster on GH200 than on premium x86 CPUs. Specifically, executing 100+ TPC-DS SQL queries on a 10 TB dataset took only 6 minutes with GH200, versus 42 minutes on x86 CPUs, a 7x speedup overall.

Notable query accelerations include:

  • Query67: 36x speedup
  • Query14: 10x speedup
  • Query87: 9x speedup
  • Query59: 9x speedup
  • Query38: 8x speedup
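
As a quick check on these figures, the end-to-end speedup follows directly from the two runtimes reported for the 10 TB run. The Python sketch below does that arithmetic and compares the result with the per-query numbers listed above. The function and variable names are illustrative only and are not part of any NVIDIA tooling.

```python
def speedup(baseline_minutes: float, accelerated_minutes: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if accelerated_minutes <= 0:
        raise ValueError("accelerated_minutes must be positive")
    return baseline_minutes / accelerated_minutes


# Runtimes reported in the article for the 10 TB NDS run, in minutes.
X86_MINUTES = 42
GH200_MINUTES = 6

overall = speedup(X86_MINUTES, GH200_MINUTES)
print(f"Overall speedup: {overall:.0f}x")

# Per-query speedups as listed in the article.
query_speedups = {
    "Query67": 36,
    "Query14": 10,
    "Query87": 9,
    "Query59": 9,
    "Query38": 8,
}

# Queries whose individual speedup exceeds the end-to-end figure.
above_overall = [name for name, s in query_speedups.items() if s > overall]
print("Above the overall figure:", ", ".join(above_overall))
```

All five highlighted queries sit above the end-to-end figure, which is consistent with their being the standouts among the 100+ queries in the run.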

Reducing Power Consumption and Cutting Energy Costs

The GH200's efficiency becomes even more apparent with larger datasets. For a 100 TB dataset, a 16-node GH200 cluster needed only 40 minutes, whereas traditional setups would need 344 CPUs to achieve the same results. This represents a 22x reduction in nodes and 12x energy savings.
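
The node-reduction figure can be reproduced from the two cluster sizes quoted above. The sketch below assumes the 344 CPUs correspond to 344 CPU nodes, which the article implies but does not state outright. The function name is hypothetical.

```python
def consolidation_ratio(cpu_nodes: int, gpu_nodes: int) -> float:
    """Return how many CPU nodes each accelerated node replaces."""
    if gpu_nodes <= 0:
        raise ValueError("gpu_nodes must be positive")
    return cpu_nodes / gpu_nodes


# Cluster sizes quoted for the 100 TB run.
# Assumption: "344 CPUs" is read here as 344 CPU nodes.
CPU_NODES = 344
GH200_NODES = 16

ratio = consolidation_ratio(CPU_NODES, GH200_NODES)
print(f"Node consolidation: {ratio:.1f}x")
```

The exact ratio is 21.5, which the article reports rounded to 22x. The 12x energy figure cannot be derived from the node counts alone, since it also depends on per-node power draw.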

Exceptional SQL Acceleration and Price Performance

HEAVY.AI benchmarked GH200 against an 8x NVIDIA A100 PCIe-based instance, reporting a 5x speedup and 16x cost savings for a 100 TB dataset. On a larger 200 TB dataset, GH200 still outperformed with a 2x speedup and 6x cost savings.

“Our customers make data-driven, time-sensitive decisions that have a high impact on their business,” said Todd Mostak, CTO and co-founder of HEAVY.AI. “We’re excited about the new business insights and cost savings that GH200 will unlock for our customers.”

Get Started with Your GH200 Apache Spark Migration

Enterprises can use the RAPIDS Accelerator for Apache Spark to migrate existing workloads to the GH200, a transition that promises significant operational efficiencies. GH200 already powers nine supercomputers globally and is available through various cloud providers. For more details, visit the NVIDIA Technical Blog.
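
For readers wondering what enabling the RAPIDS Accelerator looks like in practice, the sketch below assembles a minimal set of Spark settings and renders them as spark-submit arguments. It is a starting point under stated assumptions, not a tuned GH200 configuration. The jar path, the job file name, and the helper function names are hypothetical, and the resource amounts are placeholder values to adjust for a real cluster.

```python
from typing import Dict, List


def rapids_conf(jar_path: str,
                gpus_per_executor: int = 1,
                tasks_per_gpu: int = 4) -> Dict[str, str]:
    """Build a minimal Spark config that loads the RAPIDS Accelerator plugin."""
    if gpus_per_executor < 1 or tasks_per_gpu < 1:
        raise ValueError("gpus_per_executor and tasks_per_gpu must be >= 1")
    return {
        # Path to the RAPIDS Accelerator jar (hypothetical location).
        "spark.jars": jar_path,
        # Plugin class that turns on GPU acceleration for Spark SQL.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": "true",
        # Standard Spark GPU scheduling settings (placeholder values).
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        "spark.task.resource.gpu.amount": str(1 / tasks_per_gpu),
        # Log the operators that could not be placed on the GPU.
        "spark.rapids.sql.explain": "NOT_ON_GPU",
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render a config dict as a list of spark-submit --conf arguments."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


conf = rapids_conf("/opt/spark/jars/rapids-4-spark.jar")  # hypothetical path
print(" ".join(["spark-submit", *to_submit_args(conf), "your_job.py"]))
```

Because the plugin is enabled through configuration, existing Spark SQL and DataFrame code can typically be submitted unchanged. The explain setting helps identify which parts of a job still run on the CPU.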

