EC2 GPU Instances

EC2 GPU instances have been hugely popular due to advances in machine learning, gaming, and other compute-intensive fields. However, it can be difficult to know which one to choose.

Author: Emily Dunenfeld

The huge rise in demand for GPU compute corresponds to the dramatic growth of compute-intensive applications utilizing AI/ML, blockchain, gaming, and more. Fortunately, companies looking to harness GPU compute power can rent it from cloud providers like Amazon instead of investing in expensive hardware upfront. Amazon offers several EC2 instance options with GPUs, categorized into the G and P families, which we will compare, focusing mainly on use case and price.

GPU Recap

GPUs (graphics processing units) are designed to handle parallel processing tasks. While CPUs are ideal for general-purpose and management tasks, GPUs are ideal for compute-intensive tasks, such as machine learning (ML), data analytics, rendering graphics, and scientific simulations.

GPUs are in high demand because of the need for high-performance compute and the growing complexity of the tasks listed above. As a result, there have been supply and demand imbalances. More options are available to choose from now, though keep in mind that availability may still be limited and region-specific.
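As an illustration only (this is pure Python on a CPU; no actual GPU is involved), the CPU-versus-GPU distinction comes down to how the work is expressed. The loop below touches one element at a time, while the mapped form expresses the same elementwise operation as independent per-element tasks, which is the pattern a GPU spreads across thousands of cores:

```python
# Illustrative sketch only: pure Python on a CPU, no real GPU involved.
# GPUs accelerate workloads like this elementwise one by running each
# element's computation on a separate core simultaneously.

data = list(range(8))

# CPU-style sequential loop: one element at a time.
sequential = []
for x in data:
    sequential.append(x * x)

# Data-parallel formulation: each element is an independent task,
# which is what makes the work easy to spread across GPU cores.
parallel = list(map(lambda x: x * x, data))

assert sequential == parallel == [0, 1, 4, 9, 16, 25, 36, 49]
```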

Amazon EC2 GPU Instances

Within the accelerated computing instances, two families consist of GPU-based instances: the G and P families. The P family was the first of the accelerated computing instances and was designed for general-purpose GPU compute tasks. The family has since evolved and become widely adopted for ML workloads, with AI companies like Anthropic and Cohere using P family instances. The G family is optimized for graphics-intensive applications and has also expanded its use cases to cover the ever-popular ML workloads.

Amazon EC2 P Family Instances

| EC2 Instance | Use Cases | GPU | CPU | Network Bandwidth (Gbps) | vCPUs |
|---|---|---|---|---|---|
| P5 | Training and deploying demanding generative AI applications: question answering, code generation, video and image generation, speech recognition; HPC applications at scale: pharmaceutical discovery, seismic analysis, weather forecasting, financial modeling | 8 NVIDIA H100 Tensor Core GPUs | 3rd Gen AMD EPYC processors (AMD EPYC 7R13) | Up to 3,200 | 192 |
| P4 | ML training and deploying, HPC, computational fluid dynamics, computational finance, seismic analysis, speech recognition, autonomous vehicles, drug discovery | 8 NVIDIA A100 Tensor Core GPUs | 2nd Generation Intel Xeon Scalable processors (Cascade Lake P-8275CL) | 400 | 96 |
| P3 | Machine/deep learning training and deploying, high performance computing, computational fluid dynamics, computational finance, seismic analysis, speech recognition, autonomous vehicles, drug discovery | Up to 8 NVIDIA V100 Tensor Core GPUs | High frequency Intel Xeon Scalable Processor (Broadwell E5-2686 v4) or high frequency 2.5 GHz (base) Intel Xeon Scalable Processor (Skylake 8175) | Up to 100 | Up to 96 |
| P2 | General-purpose GPU compute: ML, high performance databases, computational fluid dynamics, computational finance, seismic analysis, molecular modeling, genomics, rendering | Up to 16 NVIDIA K80 GPUs | High frequency Intel Xeon Scalable Processor (Broadwell E5-2686 v4) | Up to 25 | Up to 64 |

EC2 P family information with Amazon recommended use cases

Amazon EC2 G Family Instances

| EC2 Instance | ML Use Cases | Graphics-Intensive Use Cases | GPU | CPU | Network Bandwidth (Gbps) | vCPUs |
|---|---|---|---|---|---|---|
| G6 | Training and deploying ML models for natural language processing: language translation, video and image analysis, speech recognition, personalization | Creating and rendering real-time, cinematic-quality graphics, game streaming | Up to 8 NVIDIA L4 Tensor Core GPUs | 3rd generation AMD EPYC processors (AMD EPYC 7R13) | Up to 100 | Up to 192 |
| G5g | ML inference and deploying deep learning applications | Android game streaming, graphics rendering, autonomous vehicle simulations | Up to 2 NVIDIA T4G Tensor Core GPUs | AWS Graviton2 Processor | Up to 25 | Up to 64 |
| G5 | Training and inference of deep learning models for simple to moderately complex ML use cases: natural language processing, computer vision, recommender systems | Remote workstations, video rendering, cloud gaming to produce high-fidelity graphics in real time | Up to 8 NVIDIA A10G Tensor Core GPUs | 2nd generation AMD EPYC processors (AMD EPYC 7R32) | Up to 100 | Up to 192 |
| G4dn | ML inference and small-scale/entry-level ML training jobs: adding metadata to an image, object detection, recommender systems, automated speech recognition, language translation | Remote graphics workstations, video transcoding, photo-realistic design, game streaming in the cloud | Up to 8 NVIDIA T4 Tensor Core GPUs | 2nd Generation Intel Xeon Scalable Processors (Cascade Lake P-8259CL) | Up to 100 | Up to 96 |
| G4ad | N/A | Remote graphics workstations, video transcoding, photo-realistic design, game streaming in the cloud | AMD Radeon Pro V520 GPUs | 2nd Generation AMD EPYC Processors (AMD EPYC 7R32) | Up to 25 | Up to 64 |
| G3 | N/A | 3D visualizations, graphics-intensive remote workstations, 3D rendering, application streaming, video encoding | NVIDIA Tesla M60 GPUs (2,048 parallel processing cores and 8 GiB of video memory each) | High frequency Intel Xeon Scalable Processors (Broadwell E5-2686 v4) | Up to 25 | Up to 64 |

EC2 G family information with Amazon recommended use cases

Choosing an EC2 GPU Instance for your ML Workload

For graphics workloads, the choice between the P and G families is usually much simpler: pick an instance in the G family. For ML workloads, more factors affect the choice, chiefly use case, performance, instance size, and price. Other factors, such as availability and hardware compatibility, will further narrow down the options.

ML Use Case (Training vs Inference vs Deploying) and Performance

The first thing to consider is the use case, whether it's training models, performing inference, or deploying pre-trained models. Certain instances are designed to handle these requirements better than others.

P family instances are generally much more powerful than comparable G family instances, making them an excellent choice for demanding ML tasks, such as large-scale model training or high-performance computing (HPC) workloads. Another obvious rule of thumb is that later generations of an instance type tend to be more performant than previous generations. So, if your use case requires the highest level of performance, consider P5 or P4 instances.

However, in many cases, such as for deploying pre-trained models or performing inference, you just don't need that level of compute. In those scenarios, the G5 or G4dn instances can be a more suitable and cost-effective choice.
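The rules of thumb above can be sketched as a simple selection helper. This is an illustrative sketch, not an official sizing tool; the workload labels and the specific instance picks are assumptions drawn from the guidance in this article:

```python
def suggest_instance_family(workload: str, needs_top_performance: bool = False) -> str:
    """Sketch of the selection guidance above (illustrative, not official).

    workload: 'graphics', 'training', 'inference', or 'deployment'
    """
    if workload == "graphics":
        # Graphics workloads almost always belong on the G family.
        return "G family (e.g. G5 or G6)"
    if workload == "training" and needs_top_performance:
        # Large-scale training and HPC favor the latest P generations.
        return "P family (P5 or P4)"
    # Inference or deploying pre-trained models rarely needs P-level compute.
    return "G family (G5 or G4dn)"

print(suggest_instance_family("graphics"))        # G family (e.g. G5 or G6)
print(suggest_instance_family("training", True))  # P family (P5 or P4)
print(suggest_instance_family("inference"))       # G family (G5 or G4dn)
```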

Instance Size

The size of the instance, in terms of CPU and memory capacity, is another important consideration since it significantly impacts performance and cost-effectiveness. The G family offers a wider range of instance sizes, allowing you to choose the appropriate CPU and memory capacity based on your workload requirements. In contrast, the P family has fewer options; for example, the P5 and P4 series each only have one instance size available.
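One way to reason about sizing is to filter the available sizes by your vCPU requirement and take the cheapest fit. Here is a sketch using a handful of on-demand rates from the pricing table in the next section (US East prices, which will drift over time; the helper name and the subset of instances are illustrative choices):

```python
# Illustrative sketch: pick the cheapest GPU instance meeting a vCPU
# requirement. Sample US East (N. Virginia) on-demand rates from this
# article; real prices change over time.
PRICES = {
    # name: (hourly USD, vCPUs)
    "g4dn.xlarge": (0.53, 4),
    "g4dn.4xlarge": (1.20, 16),
    "g5.4xlarge": (1.62, 16),
    "g5.12xlarge": (5.67, 48),
    "g6.12xlarge": (4.60, 48),
    "p3.2xlarge": (3.06, 8),
}

def cheapest_with_vcpus(min_vcpus: int) -> str:
    candidates = [
        (price, name)
        for name, (price, vcpus) in PRICES.items()
        if vcpus >= min_vcpus
    ]
    price, name = min(candidates)  # lowest hourly rate among fits
    return name

print(cheapest_with_vcpus(16))  # g4dn.4xlarge
print(cheapest_with_vcpus(48))  # g6.12xlarge
```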

EC2 GPU Pricing

| GPU Instance | On-Demand Hourly Cost | vCPU |
|---|---|---|
| p5.48xlarge | $98.32 | 192 |
| p4d.24xlarge | $32.77 | 96 |
| p3dn.24xlarge | $31.21 | 96 |
| p3.16xlarge | $24.48 | 64 |
| g5.48xlarge | $16.29 | 192 |
| p2.16xlarge | $14.40 | 64 |
| g6.48xlarge | $13.35 | 192 |
| p3.8xlarge | $12.24 | 32 |
| g5.24xlarge | $8.14 | 96 |
| g4dn.metal | $7.82 | 96 |
| p2.8xlarge | $7.20 | 32 |
| g6.24xlarge | $6.68 | 96 |
| g5.12xlarge | $5.67 | 48 |
| g6.12xlarge | $4.60 | 48 |
| g3.16xlarge | $4.56 | 64 |
| g4dn.16xlarge | $4.35 | 64 |
| g5.16xlarge | $4.10 | 64 |
| g4dn.12xlarge | $3.91 | 48 |
| g4ad.16xlarge | $3.47 | 64 |
| g6.16xlarge | $3.40 | 64 |
| p3.2xlarge | $3.06 | 8 |
| g5g.16xlarge | $2.74 | 64 |
| g5g.metal | $2.74 | 64 |
| g5.8xlarge | $2.45 | 32 |
| gr6.8xlarge | $2.45 | 32 |
| g3.8xlarge | $2.28 | 32 |
| g4dn.8xlarge | $2.18 | 32 |
| g6.8xlarge | $2.01 | 32 |
| g4ad.8xlarge | $1.73 | 32 |
| g5.4xlarge | $1.62 | 16 |
| gr6.4xlarge | $1.54 | 16 |
| g5g.8xlarge | $1.37 | 32 |
| g6.4xlarge | $1.32 | 16 |
| g5.2xlarge | $1.21 | 8 |
| g4dn.4xlarge | $1.20 | 16 |
| g3.4xlarge | $1.14 | 16 |
| g5.xlarge | $1.01 | 4 |
| g6.2xlarge | $0.98 | 8 |
| p2.xlarge | $0.90 | 4 |
| g4ad.4xlarge | $0.87 | 16 |
| g5g.4xlarge | $0.83 | 16 |
| g6.xlarge | $0.80 | 4 |
| g4dn.2xlarge | $0.75 | 8 |
| g3s.xlarge | $0.75 | 4 |
| g5g.2xlarge | $0.56 | 8 |
| g4ad.2xlarge | $0.54 | 8 |
| g4dn.xlarge | $0.53 | 4 |
| g5g.xlarge | $0.42 | 4 |
| g4ad.xlarge | $0.38 | 4 |

EC2 GPU instance pricing, US East (N. Virginia)

G family instances tend to be much more cost-effective than their P family counterparts, potentially resulting in significant cost savings for organizations that don't require the highest levels of GPU performance.
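To make that comparison concrete, here is a quick per-vCPU cost calculation for the two largest same-sized instances from the table above. This is a rough heuristic only: vCPU count ignores the very different GPUs on board, which dominate actual ML performance.

```python
# Rough cost-per-vCPU comparison between same-sized P and G instances,
# using on-demand rates from the pricing table above (US East rates,
# subject to change). This ignores GPU differences entirely.
p5_hourly, p5_vcpus = 98.32, 192   # p5.48xlarge
g5_hourly, g5_vcpus = 16.29, 192   # g5.48xlarge

p5_per_vcpu = p5_hourly / p5_vcpus
g5_per_vcpu = g5_hourly / g5_vcpus

print(f"p5.48xlarge: ${p5_per_vcpu:.3f} per vCPU-hour")  # $0.512
print(f"g5.48xlarge: ${g5_per_vcpu:.3f} per vCPU-hour")  # $0.085
print(f"hourly ratio: {p5_hourly / g5_hourly:.1f}x")     # ~6.0x
```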

g4dn

The g4dn instance receives a lot of attention, and rightfully so. It is among the lowest-cost EC2 GPU instances and is performant for ML inference and small-scale training.
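As a rough budget check, here is what a g4dn.xlarge costs if left running around the clock. This is an illustrative calculation: 730 hours approximates one month, and the on-demand rate is the US East figure from the table above, which is subject to change.

```python
# Rough monthly on-demand cost for a g4dn.xlarge running 24/7.
# Rate from the US East pricing table above; ~730 hours per month.
hourly_rate = 0.53       # g4dn.xlarge, $/hour
hours_per_month = 730    # 24 * 365 / 12

monthly_cost = hourly_rate * hours_per_month
print(f"~${monthly_cost:.2f}/month")  # ~$386.90/month
```

Reserved Instances or Savings Plans would bring this figure down further for steady-state workloads.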

Conclusion

Selecting an EC2 GPU instance, though less of a commitment than purchasing and setting up your own hardware, is still a significant investment with many factors to consider. The two accelerated computing families with GPU instances, the P family and the G family, both have several options to choose from. While the P family has instances better suited for demanding tasks like large-scale model training, G family instances offer a balance of performance and cost-effectiveness that makes them a good choice for many workloads.
