Advanced Spark Tuning

Apache Spark Application Performance Tuning Workshop

Summary

This three-day hands-on training course presents the concepts and architectures of Spark and the underlying data platform, providing students with the conceptual understanding necessary to diagnose and solve performance issues.

With this understanding of Spark internals and the underlying data platform, the course teaches students how to tune Spark application code and configuration. The course illustrates performance design best practices and pitfalls. Students are prepared to apply these patterns and anti-patterns to their own designs and code.

The course format emphasizes instructor-led demos of performance issues and the techniques that address them, followed by hands-on exercises. Students explore these issues and techniques in an interactive notebook environment and leave the course with a practical, illustrative body of code.

Duration

3 Days

Prerequisites

This course is designed for software developers, engineers, and data scientists who develop Spark applications and need the information and techniques to tune their code. Good working knowledge of Spark is a prerequisite. Spark examples and hands-on exercises are presented in Python and Scala, so the ability to program in one of those languages is required. Basic familiarity with the Linux command line is assumed, and basic knowledge of SQL is helpful.

Upcoming Classes

Online

Instructor-led online training

Virtual Classroom, AMER (Cloudera): Oct 6 – Oct 8, 2020; Nov 23 – Nov 25, 2020
Virtual Classroom, EMEA (Cloudera): Nov 24 – Nov 26, 2020


Onsite Training

Request a quote for a private training session.

