[Hindi] PySpark Industry-Level Mastery / Databricks Certified Spark 3 Developer Training
About Course
- Course consists of 500 hands-on exercises and two end-to-end PySpark projects
- Live doubt-clearing sessions every weekend
- 100 hours of learning overall
- 30 hours of Python training
- 500 hands-on PySpark exercises
- Two real-time projects (LIVE sessions)
- Databricks certification training
- Streaming data processing scenarios with Kafka and Spark
- Industry best practices for optimizing PySpark projects
Curriculum:
- Apache Spark Introduction
- Spark Internals | Spark Architecture In-Depth Tour
- Python Foundations
- RDD Programming in PySpark
- SparkSQL Introduction
- Intensive Hands-On | SparkSQL DataFrames
- SparkSQL Advanced Concepts | Optimizations
- PySpark 500 Exercises
- Spark Streaming and Kafka
- Optimization Techniques Used in Industry for Spark Applications
- Creating and Deploying Spark Applications on AWS EMR
- End-to-End PySpark Project on AWS Cloud
What Will You Learn?
- Deep Understanding of Apache Spark 3
- Understanding of real-time scenarios
- Certification and Industry Training
Course Content
Spark Introduction
- Spark Introduction (01:00:03)
Spark Architecture
- Spark Architecture Overview | Driver, Cluster Manager, Executor, RDD (01:14:37)
- RDD Operations | Types of Transformations | DAG (01:30:38)
- Spark Application | Job | Stages | Tasks (01:48:23)
- Spark Architecture End to End (01:32:37)
- Spark Deploy Modes (01:32:37)
PySpark Installation on Windows
- PySpark Installation on Windows (24:26)
Prerequisite 1: The Python Adventure
- Python | Prerequisites (03:58)
- Session 1 | Setup and Cloning Repo (01:03:19)
- Session 2 | Python | Variables and Printing (53:55)
- Session 3 | Python | Formatting and Escape Sequences (29:22)
- Session 4 | Python | Decision Making | if elif else (50:10)
- Session 5 | Python | List, Tuple, Set, Dictionaries (51:53)
- Session 6 | Python | Loops (46:30)
- Session 7 | Python | Gaming Using Loops (33:51)
- Session 8 | Python | List Comprehension | String Slicing (45:57)
- Session 9 | Python | Slicing and Step (01:00:55)
- Session 10 | String Manipulation (51:57)
- Session 11 | String Manipulation 2 (01:14:27)
- Session 12 | List Comprehension (56:52)
- Session 13 | Functions Introduction (01:05:47)
- Session 14 | Functions 2 (01:09:21)
- Session 15 | Lambda Functions (01:00:30)
- Session 16 | File IO (01:08:12)
- Session 17 | Modules | Exception Handling (02:19:51)
- Session 18 | Regular Expressions (01:00:27)
- Session 19 | Regular Expressions | OOP Introduction (01:25:25)
- Session 20 | Class Instance Attributes (53:10)
- Session 21 | Class Methods vs Instance Methods vs Static Methods (02:10:18)
- Session 22 | Dunder Methods | Operator Overloading (01:04:14)
- Session 23 | Property Decorator (54:39)
- Session 24 | Encapsulation and Private Attributes (54:39)
- Session 25 | Abstraction and Spark Introduction (01:47:44)
RDD Operations
- RDD Basics (01:03:08)
- RDD Exercises Hands-On | PySpark (01:31:10)
- RDD Practice | PySpark (01:21:36)
- Broadcast Variables | PySpark (46:16)
- Accumulator and SparkSQL Introduction | PySpark (01:08:20)
SparkSQL
- Introduction to SparkSQL (04:21)
- DataFrame vs Dataset | Catalyst Optimizer (01:17:00)
- Creating DataFrames from Various Sources (01:10:40)
- SparkContext vs SparkSession | TempView vs GlobalTempView | PySpark (01:27:35)
- DataFrame Structured Transformations | PySpark (29:55)
- Executor Memory Architecture (01:28:20)
pyspark-practice-hands-on-200
- PySpark200 Setup (01:13:39)
- Creating DataFrames from Text Files (44:14)
- Creating DataFrames from Binary File Formats (24:55)
- Creating DataFrames: Doubts (01:03:34)
- Creating DataFrames from MySQL and S3 (03:14)
- select and selectExpr (35:06)
- select and filter: Intermediate (17:00)
- withColumn and withColumnRenamed (01:25:11)
- sort and orderBy (34:25)
- groupBy and Aggregate (32:01)
- Join Operations (35:59)
- Set Operations (39:59)
- Window Operations (01:19:09)
- Data Cleaning (54:26)
- distinct and dropDuplicates (15:26)
- Pivot and Unpivot (37:37)
- UDF (35:21)
PySpark DF Scenarios and Databricks Certification Practice
- Session 1 (01:07:53)
- Session 2 (01:02:28)
- Session 3 (01:16:19)
- Session 4 (51:23)
- Session 5 (47:50)
- Session 6 (35:49)
- Session 7 (40:39)
- Session 8 (29:37)
- Practice Exercises and Solutions Document (00:00)
Spark SQL Advanced
- AQE | Cache vs Persist (01:48:43)
- Caching Doubts | Serialization and Deserialization (53:10)
- Coalesce vs Partition Scenarios (01:32:36)
- Resource Calculations for Spark Application | DRA (01:40:09)
- Garbage Collection Tuning (48:57)
Certification Dump Discussion
PySpark 500
- How to Get Access (00:00)
Deploying Spark Applications on AWS EMR
- Deploying Spark Application in Client Mode (58:05)
End To End PySpark Project
Course Material
- Presentation (00:00)
Kafka Sessions
- Kafka Installation (18:04)
- Session 1 (48:44)
- Session 2 (39:55)
- Session 3 (15:00)
- Session 4 (19:09)
- Session 5 (01:04:18)