Self Paced - Fullstack Data Engineering - Azure | Databricks

About Course

A complete data engineering course covering Big Data with Hadoop and Spark. The course focuses on Big Data frameworks such as Hadoop and Spark, and covers tools in the Hadoop ecosystem such as Hive, Sqoop, Flume, Spark, and Kafka.

Course Content:

Azure Data Engineering
AWS Data Engineering
Databricks Data Engineering
6 End-to-End Projects
Spark Streaming
Python Programming
Apache Hadoop
Apache Hive
PySpark 500 Hands On Exercises
SparkSQL
Kafka
NoSQL

Course Content

M1 – Data Engineering Roles and Responsibilities and Challenges

Responsibilities And 10 Dimensions

01:40:47
What is big data | 5 V’s | role of RAM, processor, and HDD

44:26
Distributed Storage and Distributed Processing

49:59

Starter Kit

M2 – Hadoop Ecosystem (HISTORY LESSONS)
To understand how data engineering practices have evolved, you may review the following legacy sessions. For a modern, industry-aligned learning path, I recommend this sequence: SQL → Python → PySpark → PySpark Projects. Before beginning this path, I also suggest covering Hadoop fundamentals up to the YARN architecture, which provides helpful context for distributed processing and a strong foundation for the upcoming modules. The legacy sessions are included for those working with older systems who may still find them useful.

WARNING IMPORTANT !!!!!!!!!

03:07
hadoop session 1 definition and components

01:16:35
hadoop session 2 namenode datanode secondary namenode

32:53
hadoop session 3 secondary namenode and high availability

01:28:53
hadoop session 4 secondary namenode HA and federation

01:07:41
hadoop session 5 file blocks and replication

01:23:17
hadoop session 6 rack awareness

01:09:46
hadoop session 7 yarn architecture part 1

01:09:03
hadoop session 8 yarn architecture part 2

01:09:50
Hadoop Session 9 – Necessary Setups and Discussion

01:30:26
Hadoop Session 8 – MR workflow [Deprecated]

01:27:24
Hadoop Session 9 – YARN and MR QnA and Revision | Safe Mode | Load Balancer

01:07:57
Hadoop Session 10 – MR | File Blocks vs Input Splits [Deprecated]

02:13:29
Hadoop Session 11 – MR Workflow Revision | WordCount Hands On [Deprecated]

02:14:08
Hadoop Session 12 – Combiner and Partitioner [Deprecated]

02:39:59
Hadoop Session 13 – Reduce side join 1 [Deprecated]

01:32:32
Hadoop Session 14 – Reduce side join 2 [Deprecated]

01:32:10
Hadoop Session 15 – Map Side Join [Deprecated]

01:36:28
[Optional] setting_up_single_node_hadoop_cluster

00:00
AWS EMR Session 1 – EC2 introduction

23:37
AWS EMR Session 2 – IAM Roles

24:27
AWS EMR Session 3 – Starting EMR Cluster | Connecting to EMR from your System

58:26
AWS EMR Session 4 – Accessing EMR Web UIs

58:26
Hive Session 1 – Hive Introduction and Architecture

01:25:03
Hive Session 2 – Hive Basic Commands

01:27:23
Hive Session 3 – Internal vs External Tables

58:35
Hive Session 4 – Design level optimizations (Partitioning)

01:06:31
Hive Session 5 – Design level optimizations (Bucketing) | Logical Joins

01:15:13
Hive Session 6 – Bucketing Scenarios | Hive SerDe

01:06:59
Hive Session 7 – SerDe Correction

02:55
Hive Session 8 – Join strategies in Hive | MR

01:13:03
Hive Session 9 – Hive Project

22:29
Hive Session 10 – Join Optimizations

01:34:31
Hive Session 11 – Hive Transactional Tables

01:19:36
Hive Session 12 – Hive Transactional Tables Materialized

01:15:29
Hive Session 13 – CBO | Vectorization | Resource level optimization | Materialized Views

01:44:59
Hive Session 14 – Vectorization in Hive

12:48
Hive Session 15 – MSCK repair

06:09
Hive Session 16 – UDF, UDAF, UDTF

01:30:37
Hive Exercise Documents and Data Files

00:00
Hive Notes

00:00
Sqoop Session 1 – Sqoop Introduction

01:19:28
Sqoop Session 2 – Sqoop Incremental Import

01:21:12
Flume Session 1 – Flume Introduction

01:39:31
Flume Session 2 – Flume Configuration

02:11:07

M3 – Python for Pyspark

Setup For Python Exercises

17:34
Session 1 – Python Hello World

01:04:27
Session 2 – Understanding Variables in Python

01:30:53
Session 3 – Input, Escape Sequences & Value Passing in Python

01:41:41
Session 4 – User Inputs CLI Args

01:01:17
Session 5 – Python conditional statements session 1

01:24:01
Session 6 – Python Conditional Statements Session 2

01:10:37
Session 7 – Python List Indexing Slicing and Step

01:23:51
Session 8 – Python List Operations and Machine Test Questions

54:04
Session 9 – Python Tuple and Dictionaries

01:03:50
Session 10 – Python Sets and For loop Fundamentals

01:23:49
Session 11 – Python While loops

01:22:10
Session 12 – Simple Login App using Python

01:04:56
Session 13 – Python Loops Break and Continue

01:04:23
Session 14 – List Comprehension 1

01:31:08
Session 15 – List Comprehension 2

01:46:33
Session 16 – Python Functions 1

01:23:59
Session 17 – Python Functions 2

01:28:22
Session 18 – Python Functions 3

01:10:46
Session 19 – Python File IO

57:56
Session 20 – Python Regular Expressions 1

01:11:50
Session 21 – Python Regular Expressions 2

01:11:22
Session 22 – Python OOP Session 1

01:26:41
Session 23 – Python OOP Session 2 – Instance Variable vs Class Variable

01:12:01
Session 24 – python oop instance vs class vs static methods

01:04:20
session 25 – python oop inheritance

01:08:32
session 26 – python oop property decorator

01:13:34
session 27 – python private methods and private attributes

35:35
session 28 – python abstract base class and abstract method

36:12
session 29 – python polymorphism and encapsulation

58:33
session 30 – python functional programming first class citizens

01:09:20
session 31 – python functional programming closures

37:53
session 32 – python functional programming closures revision

39:40
session 33 – python functional programming decorators

56:17
session 34 – python functional programming generators

01:04:24
session 35 – python functional programming generators

45:14
session 36 – python functional programming iterators

37:20
session 37 – python modules and name main

01:14:12
session 38 – python packages init

34:19
session 39 – python exception handling

26:58

M4 – SQL for Data Engineering

session 1 – rdbms session introduction to dbms

47:34
session 2 – rdbms session introduction to mysql

30:28
session 3 – rdbms session ER modelling entity attribute types entity set entity types

53:24
session 4 – rdbms session election of a primary key

23:27
session 5 – rdbms session relationship all terminologies

01:04:21
session 6 – CRUD Basics

01:25:18
session 7 – keys hands on

01:16:27
session 8 – purpose and defining the constraints

47:57
session 9 – foreign key options

50:15
session 10 – alter constraints

51:06
session 11 – select and filter session 1

01:15:52
session 12 – select and filter session 2

01:07:31
session 13 – group by 1

01:01:45
session 14 – group by 2

01:07:10
session 15 – group by 3

01:18:49
session 16 – order by 1

41:59
session 17 – order by 2

01:00:32
session 18 – order by 3

01:01:50
session 19 – case expression 1

01:17:19
session 20 – joins 1

01:38:36
session 21 – joins 2

50:19
session 22 – window functions session 1

01:13:42
rdbms session 23 window aggregate functions

01:09:30
rdbms session 24 window analytical functions and advanced window functions

53:31
rdbms session 25 union unionall intersection except

01:02:26
rdbms session 26 subqueries session 1

01:24:37
rdbms session 27 subqueries session 2

40:33
rdbms session 28 CTE common table expression

01:26:26

M5 – Linux mastery

Session 01 | Linux (24-06-2025)

01:41:46
Session 02 | Linux (25-06-2025)

01:04:30
Session 03 | Linux (26-06-2025)

01:20:36
Session 04 | Linux (27-06-2025)

01:30:48
Session 05 | Linux (28-06-2025)

01:51:05
Session 06 | Linux (29-06-2025)

01:50:19
Session 07 | Linux (01-07-2025)

01:29:16
Session 08 | Linux (02-07-2025)

01:24:37
Session 9 | Linux (04-07-2025)

14:10
Session 10 | Linux (04-07-2025)

01:36:19
Session 11 | Linux (05-07-2025)

01:53:03
Session 12 | Linux (06-07-2025)

02:01:48
Session 13 | Linux (07-07-2025)

01:50:02
Session 14 | Linux (08-07-2025)

02:04:15
Session 15 | Linux (10-07-2025)

01:34:44
Session 16 | Linux (11-07-2025)

01:07:45
Session 17 | Linux (29-07-2025)

01:34:44
Session 18 | Linux (30-07-2025)

01:22:27
Session 19 | Linux (31-07-2025)

01:29:07

M6 – PySpark Essentials For Data Engineering

[Overview] Spark Session 1 – Introduction

16:37
[ClassRec] Spark Session 1 – Introduction

37:34
[Overview] Spark Session 2 – Spark Cluster vs Application Architecture

12:07
[ClassRec] Spark Session 2 – Spark Cluster vs Application Architecture

47:42
[Overview] Spark Session 3 – RDD Terminologies and Features

47:20
[ClassRec] Spark Session 3 – RDD Terminologies and Features

52:01
[Overview] Spark Session 4 – App vs Job vs Stage vs Task

33:41
[ClassRec] Spark Session 4 – App vs Job vs Stage vs Task

49:23
[Overview] Spark Session 5 – Spark Cluster vs Client Mode

12:47
[ClassRec] Spark Session 5 – Spark Cluster vs Client Mode

35:05
[Overview] Spark Session 6 – Spark Architecture

39:24
[ClassRec] Spark Session 6 – Spark Architecture

48:59
[Overview] Spark Session 7 – Spark Distributed Shared Variables

43:39
[ClassRec] Spark Session 7 – Spark Distributed Shared Variables

36:21
[Overview] Spark Session 8 – SparkSQL Introduction – RDD vs DF vs DS

40:58
[ClassRec] Spark Session 8 – SparkSQL Introduction – RDD vs DF vs DS

45:47
[Overview] Spark Session 9 – Spark Catalyst Optimizer

26:01
[ClassRec] Spark Session 9 – Spark Catalyst Optimizer

10:06
[Overview] Spark Session 10 – SparkContext vs SparkSession

39:32
[ClassRec] Spark Session 10 – SparkContext vs SparkSession

46:56
[Installation] Spark 3.5 Installation

20:35
[Overview] Spark Session 11 – Setup For Exercises

12:55
[ClassRec] Spark Session 12 : Ways to Create RDDs

16:34
[ClassRec] Spark Session 13 – RDD Creations Practice and Good Practices

16:34
[Overview] Spark Session 14 – map, mapPartitions, mapPartitionsWithIndex, glom

35:56
[ClassRec] Spark Session 14 – map, mapPartitions, mapPartitionsWithIndex, glom

27:08
[ClassRec] Spark Session 15 – map vs flatMap

13:11
[ClassRec] Spark Session 16 – groupByKey vs reduceByKey

40:50
[Overview] Spark Session 17 – Creating DFs from CSV files

38:11
[ClassRec] Spark Session 17 – Creating DFs from CSV files

27:18
[Overview] Spark Session 18 – Creating DF From JSON and XML Files

07:08
[ClassRec] Spark Session 18 – Creating DF from JSON, nested JSON, MultiChar and Custom Delimiter

22:15
[Overview] Spark Session 19 – Creating DFs from Binary files

11:42
[ClassRec] Spark Session 19 – Creating DFs from Binary files

33:36
[Overview] Spark Session 20 – Referring Columns, select, selectExpr, filter

15:50
[ClassRec] Spark Session 20 – Referring Columns, select, selectExpr, filter

17:28
[ClassRec] Spark Session 21 – sort / orderBy

17:38
[Overview] Spark Session 22 – groupBy and Aggregations

21:07
[ClassRec] Spark Session 22 – groupBy and Aggregations

19:23
[Overview] Spark Session 23 – Joins (inner, outer, left, right, left semi, left anti, cross, self)

18:58
[ClassRec] Spark Session 23 – Joins (inner, outer, left, right, left semi, left anti, cross, self)

39:20
[ClassRec] Spark Session 24 – Joins Revision

20:15
[Overview] Spark Session 25 – Window Functions | Ranking Functions

27:34
[ClassRec] Spark Session 25 – Window Functions | Ranking Functions

32:13
[Overview] Spark Session 26 – Window Analytical and Aggregate Functions

29:07
[ClassRec] Spark Session 26 – Window Aggregate Functions

21:56
[ClassRec] Spark Session 27 – Window Analytical Functions

17:10
[Overview] Spark Session 28 – Dealing With NULL Values

16:17
[Overview] Spark Session 29 – Dealing With Duplicate Records

05:18
[Overview] Spark Session 30 – Pivot and UnPivot

32:15
[Overview] Spark Session 31 – UDFs in PySpark

23:39
[ClassRec] Spark Session 31 – UDFs in Spark

26:28
Session 01 – Introduction to Apache Spark: Cluster View and Application View

01:11:11
Session 02 – RDD Fundamentals – Part 1

50:48
Session 03 – RDD Fundamentals – Part 2

01:26:11
Session 04 – Spark Application Execution: Jobs and Stages

01:29:46

M7 – Spark Advanced – Optimization Techniques – Industry Scenarios

[Overview] Spark Session 32 – Cache vs Persist

55:26
[ClassRec] Spark Session 32 – Cache vs Persist

22:32
[ClassRec] Spark Session 32 – Cache vs Persist S2

01:05:28
[Overview] Spark Session 33 – Executor Memory Architecture

32:36
[ClassRec] Spark Session 33 – Executor Memory Architecture S1

55:16
[ClassRec] Spark Session 33 – Executor Memory Architecture S2

01:12:00
[Overview] Spark Session 34 – Adaptive Query Execution

45:30
[ClassRec] Spark Session 34 – Adaptive Query Execution

01:10:32
[Overview] Spark Session 35 – Join Strategies in PySpark

52:34
[ClassRec] Spark Session 35 – Join Strategies – Broadcast Join

50:52
[ClassRec] Spark Session 35 – Join Strategies – Shuffle Hash Join

01:02:44
[ClassRec] Spark Session 35 – Join Strategies – Sort Merge Join and More

37:27
[ClassRec] Spark Session 36 – Resource Calculations For Spark Applications

01:13:12
[ClassRec] Spark Session 37 – Dynamic Resource Allocation

30:59
[ClassRec] Spark Session 38 – Garbage Collection Tuning

01:04:26
[ClassRec] Spark Session 39 – Handling Data Skew S1

44:36
[Overview] Spark Session 40 – Controlling Parallelism For Spark Applications

42:53
[ClassRec] Spark Session 40 – Controlling Parallelism For Spark Applications

56:38
[ClassRec] Spark Session 41 – Handling Data Skew S2

40:07
[ClassRec] Spark Session 42 – Design Level Optimizations

45:08
[ClassRec] Spark Session 43 – Out Of Memory Error – Speculative Execution – DPP

49:20

M8 – Full Stack Data Engineering using Azure Databricks

DBX_001_2025-12-15_082818_S01_workspace-resource-groups-managed-resource-group.mp4

48:59
DBX_002_2025-12-16_083118_S02_networking-fundamentals-1.mp4

01:11:09
DBX_003_2025-12-17_084919_S03_networking-2.mp4

55:44
DBX_004_2025-12-19_083447_S03_deploy-databricks.mp4

34:51
DBX_006_2025-12-20_084710_S05_workspace-basics.mp4

48:59
AVOID unexpected BILLS

05:39
DBX_007_2025-12-21_085139_S06_dbutils-storage-local-vs-dbfs.mp4

54:52
DBX_008_2025-12-22_084744_S07_hands-on-formalities.mp4

01:00:07
DBX_009_2025-12-23_084954_S08_(FLOP)_blob-vs-adls-dbfs-vs-volumes.mp4

57:25
DBX_010_2025-12-25_084527_S09_azure-storage-fundamentals.mp4

01:04:29
DBX_011_2025-12-26_084003_S10_storage-options-hands-on.mp4

54:53
DBX_012_2025-12-27_084545_S11_unity-catalog-overview.mp4

57:31
DBX_013_2025-12-29_083503_S12_sp-accessing-adls-from-databricks.mp4

01:01:53
DBX_014_2025-12-30_081837_S13_mi-accessing-adls-from-databricks.mp4

40:29
DBX_015_2025-12-31_080348_S14_ak-sp-mi-process-adls-access.mp4

54:10
DBX_016_2026-01-01_082113_S15_project-incremental-ingestion-pipeline-1.mp4

01:04:52
DBX_017_2026-01-02_082355_S15_project-incremental-ingestion-pipeline-2.mp4

01:11:03
DBX_018_2026-01-03_082101_S15_project-incremental-ingestion-pipeline-3.mp4

01:09:50
DBX_019_2026-01-05_081929_S16_project-scd-type-1-implement.mp4

30:32
DBX_020_2026-01-06_081429_S17_project-scd-type-2-implementation.mp4

01:13:04
DBX_021_2026-01-09_082106_S18_unity-catalog-purpose.mp4

58:20
DBX_022_2026-01-10_082743_S19_data-cleaning-standardization-terminologies.mp4

01:10:27
DBX_023_2026-01-12_082017_S20_uc-managed-vs-non-managed-tables.mp4

59:55
DBX_024_2026-01-13_082148_S21_delta-tables-overview.mp4

01:27:41
DBX_025_2026-01-14_083615_S22_uc-essentials-catalog-schema-table.mp4

47:32
DBX_026_2026-01-16_083754_S23_s1-managed-and-non-managed-tables.mp4

01:06:02
DBX_027_2026-01-17_000000_PRJ_databricks-azure-project_B12_S13.mp4

01:01:08
DBX_028_2026-01-19_082047_S25_delta-tables-overview.mp4

52:45
DBX_029_2026-01-20_083429_S26_delta-table-anatomy-1.mp4

01:19:54
DBX_030_2026-01-21_084023_S27_delta-tables-mvcc-si-occ.mp4

43:15
DBX_031_2026-01-22_083850_S28_optimize-and-vacuum_01.mp4

01:09:16
DBX_033_2026-01-23_083143_S29_time-travel-and-cloning.mp4

01:04:54
DBX_034_2026-01-26_125446_S30_partitioning.mp4

01:02:06
Data Modelling SCD types

50:40
ISSUE : Databricks Stuck In Deleting State

11:54

M9 – Kafka Essentials For Data Engineering

kafka session 1 purpose and place of kafka in data engineering

53:31
kafka session 2 a big picture kafka KRaft vs ZooKeeper mode

01:12:10
kafka session 3 install kafka 4 in kraft mode in windows and ubuntu

21:22
kafka session 4 producer leader follower replicas minimum isr

01:31:56
kafka session 5 partition placement strategy in kafka

41:54
kafka session 6 kafka topic internals

01:12:58
kafka session 7 consumer and consumer groups

22:15
kafka session 8 understanding python producer

01:01:31
kafka session 9 understanding python consumer

46:31
kafka session 10 kafka and pyspark

53:49

M10 – Spark Streaming

Azure Data Engineering Complete Course

AWS Data Engineering Complete Course

Big Data With AWS || Demo Session

36:43
AWS IAM User, Group and Policies

01:26:35
Big Data With AWS | IAM Roles

01:23:47
AWS Cloud Infrastructure

31:39
Big Data With AWS | S3 Session 1

01:40:25
S3 Session 2

01:00:49
S3 Session 3

01:00:45
AWS glue | Crawlers And Jobs

01:23:58
Glue Scenarios | Glue Workflows

01:24:01
Glue Scenarios

01:15:03
EMR basics EC2 introduction

23:36
EMR_basics_IAM_Role

24:26
Starting EMR cluster | Connecting to EMR Cluster

58:25
AWS EMR | Starting and Deploying Spark Application on EMR

58:04
AWS EMR | Cluster Mode Deployment | Accessing Web UIs | Steps Introduction

58:48
BDA | Deploying Spark Application in Cluster Mode on EMR

11:15
Deploying Spark Application using Steps on AWS EMR

01:02:03
Athena Basics

01:23:48
Athena on Command Line

59:43
Using Athena through python code

41:27
Redshift And Data Warehousing Introduction

41:34
Redshift Clusters | Snapshots | S3 Copy

59:16
Creating redshift cluster and making it publicly accessible

13:58
Redshift connect using sql workbench and python script

38:55
dist keys and sort keys in redshift

24:30
DIST Keys hands on

30:38
Redshift Federated Queries

40:58
What is Streaming Data | Streaming Data Terminologies

23:24
Kinesis Data Streams | Kinesis Architecture and terminologies

40:05
Streaming Data using Console Producer | Python Producer and Python Consumer

39:48

MongoDB NoSQL For Data Engineering

Airflow for Data Engineers

DevOps in DE | Version Control System Essentials

CI / CD for data Engineering Pipelines

Course End Projects | Live Projects

Course Material


Starter Kit

Support and Contact Guide

Course Materials Access Guide

Study Roadmap Access Guide
