Award-winning Data Science with Python Masterclass Training with access to an online mentor via chat or email, final exam assessment, and Practice Labs.
Volume discount
1 piece: no discount, €1.208,79 / €999,00
2 pieces: 2% discount, €1.184,61 / €979,02 per piece
3 pieces: 3% discount, €1.172,53 / €969,03 per piece
4 pieces: 4% discount, €1.160,44 / €959,04 per piece
5 pieces: 5% discount, €1.148,35 / €949,05 per piece
10 pieces: 10% discount, €1.087,91 / €899,10 per piece
25 pieces: 15% discount, €1.027,47 / €849,15 per piece
50 pieces: 20% discount, €967,03 / €799,20 per piece
Official exam: online or in person
Start now: award-winning e-learning, including practice exams & 24/7
ISO 9001 & 27001 working method: 1000+ organizations have gone before you
Custom training & free intake: including a baseline assessment with training
Product description
Data Science with Python Masterclass Training
Become a data-driven expert with more than 120 hours of in-depth online training!
Why choose this training?
Data is the new gold, but only if you know how to analyze, clean, integrate, and visualize it. The Data Science with Python Masterclass Training offers you the complete journey from beginner to advanced data specialist.
With more than 120 hours of interactive content, you build a solid foundation in data analysis, statistics, data architecture, and programming skills with Python and R. You learn how to turn raw, disjointed data into valuable insights using modern tools such as Spark and Hadoop.
You also learn how to structure data according to compliance and governance guidelines, and how to create powerful visualizations for well-founded decision-making.
✔️ 1 year of 24/7 access to the training
✔️ Interactive videos, practical assignments, and quizzes
✔️ Includes a certificate of participation
✔️ Learn at your own pace, wherever and whenever you want
The training is divided into four specialized learning tracks:
Track 1: Data Analyst
Track 2: Data Wrangler
Track 3: Data Ops
Track 4: Data Scientist
Who should attend?
This masterclass is suitable for:
Professionals who want to get started in data analysis and data science
Analysts who want to work with Python, R, Spark, and Hadoop
IT professionals who want to deepen their data and programming skills
Business Intelligence specialists who want to strengthen their analyses
Anyone who wants to turn data into smart business decisions
This learning path, with more than 120 hours of online content, is divided into the following four tracks:
Course content
Data Science Track 1: Data Analyst
This track focuses on the data analyst role, covering Python, R, architecture, statistics, and Spark.
Content: E-learning courses
Data Architecture Primer
Course: 1 Hour, 4 Minutes
Course Overview
Data Defined
Data Privacy
The Data Lifecycle
SQL vs. NoSQL
Create an Entity Relationship Diagram
Implement a SQL Solution
Implement a NoSQL Solution
Big Data
Data Architecture and Governance
IT Data System Architecture Types
Data Analytics and Reporting
Exercise: Implement Data Architecture Best Practices
Data Engineering Fundamentals
Course: 46 Minutes
Course Overview
Overview of Distributed Systems
Batch vs. In-Memory Processing
NoSQL Stores
Tools for Data Management
What is ETL?
ETL with Talend Open Studio
Data Modeling
AI and Machine Learning
Data Partitioning
Data Engineering
Data Reporting
Exercise: Create a Data Model
Python for Data Science: Introduction to NumPy for Multi-dimensional Data
Course: 1 Hour
Course Overview
Introduction to NumPy and the NumPy Ecosystem
Array Creation - Part 1
Array Creation - Part 2
Printing Arrays
Basic Array Operations
Universal Functions
Indexing and Slicing
Iterating Over Arrays
Reshaping Arrays
Exercise: Python NumPy Array Operations
Python for Data Science: Advanced Operations with NumPy Arrays
Reshape Dataframes Using Stack and Melt Operations
Exercise: Pandas for Basic Tabular Data Manipulation
Python for Data Science: Manipulating and Analyzing Data in Pandas DataFrames
Course: 45 Minutes
Course Overview
Iterating Over the Contents of a DataFrame
Exporting a DataFrame
Sorting
Handling Missing Data
Grouping with a Multi-Index
Merging DataFrames
Applying Join Operations on DataFrames
Pandas and Relational Databases
Exercise: Pandas for Advanced Data Manipulation
R for Data Science: Data Structures
Course: 52 Minutes
Course Overview
Creating Vectors
Manipulating Vectors
Sorting Vectors
Using Lists
Creating Matrices
Matrix Operations
Creating Factors
Creating Data Frames
Data Frame Operations
Exercise: Creating and Using a Data Frame
R for Data Science: Importing and Exporting Data
Course: 34 Minutes
Course Overview
Reading from CSV
Reading from Excel
Reading from HTML
Exporting to CSV
Exporting to Excel
Exporting to HTML
Exercise: Reading and Writing Data
R for Data Science: Data Exploration
Course: 41 Minutes
Course Overview
Creating dplyr Tables
Selecting Subsets
Filtering Tabular Data
Piping Data
Mutating Data
Summarizing Data
Combining Datasets
Grouping Data
Exercise: Querying Data
R for Data Science: Regression Methods
Course: 37 Minutes
Course Overview
Linear Data Preparation
Creating Linear Models
Interpreting Model Output
Using Linear Prediction
Logistic Data Preparation
Using glm
Exercise: Creating a Linear Model
R for Data Science: Classification & Clustering
Course: 39 Minutes
Course Overview
Preparing Data for Classification
Using rpart
Using ctree
Preparing Data for Clustering
Using K-Means Clustering
Using Hierarchical Clustering
Exercise: Creating a Decision Tree
Data Science Statistics: Simple Descriptive Statistics
Course: 1 Hour, 11 Minutes
Course Overview
Descriptive and Inferential Statistics
Population vs. Sample
Probability vs. Non-Probability Sampling
Mean
Median
Mode
IQR
Variance
Exercise: Using Descriptive Statistics
Data Science Statistics: Common Approaches to Sampling Data
Course: 47 Minutes
Course Overview
Terms in Sampling
Sampling Bias
Simple Random Sampling
Systematic Random Sampling
Stratified Sampling
Non-Probability Sampling
Exercise: Efficient and Correct Sampling
Data Science Statistics: Inferential Statistics
Course: 1 Hour, 2 Minutes
Course Overview
Gaussian Distribution
Inferential Statistics and Hypothesis Testing
Simplified Example of Hypothesis Testing
T-tests
Skewness and Kurtosis
Correlation and Autocorrelation
Introducing Linear Regression
Overfitting and Goodness-of-Fit
Exercise: Basic Inferential Statistics
Accessing Data with Spark: An Introduction to Spark
Course: 1 Hour, 7 Minutes
Course Overview
Introduction to Spark and Hadoop
Resilient Distributed Datasets (RDDs)
RDD Operations
Spark DataFrames
Spark Architecture
Spark Installation
Working with RDDs
Creating DataFrames from RDDs
Contents of a DataFrame
The SQLContext
The map() Function of an RDD
Accessing the Contents of a DataFrame
DataFrames in Spark and Pandas
Exercise: Working with Spark
Getting Started with Hadoop: Fundamentals & MapReduce
Course: 1 Hour, 4 Minutes
Course Overview
An Introduction to Big Data
Building Systems to Scale with Data
A Quick Overview of Hadoop
MapReduce Overview
The Map Phase of a MapReduce
The Shuffle and Reduce Phases
Exercise: Fundamentals of Hadoop and MapReduce
Getting Started with Hadoop: Developing a Basic MapReduce Application
Course: 1 Hour, 14 Minutes
Course Overview
Provisioning a Hadoop Cluster on the Cloud
Browsing the Hadoop Web Applications
Creating a MapReduce project
Coding the Map Phase
Coding the Reduce Phase
Defining the Driver Program
Building the Application
Executing the MapReduce Application
Exercise: Developing a Basic MapReduce Application
Hadoop HDFS: Introduction
Course: 1 Hour, 15 Minutes
Course Overview
Scaling Datasets
Horizontal Scaling for Big Data
Distributed Clusters and Horizontal Scaling
Overview of HDFS
HDFS Architectures
MapReduce for HDFS
YARN for HDFS
The Mechanism of Resource Allocation in Hadoop
Apache Zookeeper for HDFS
The Hadoop Ecosystem
Exercise: An Introduction to HDFS
Hadoop HDFS: Introduction to the Shell
Course: 53 Minutes
Course Overview
Creating a Hadoop Cluster on the Google Cloud
Exploring Hadoop Clusters
The YARN Cluster Manager UI
The HDFS NameNode UI
Browsing the Packaged Hadoop Tools
Configuring HDFS
The HDFS Shells
Exercise: Introduction to the HDFS Shell
Hadoop HDFS: Working with Files
Course: 48 Minutes
Course Overview
Basic Directory Commands in HDFS
Using the copyFromLocal Command in HDFS
Using the put Command in HDFS
Using the copyToLocal Command in HDFS
Retrieving files from HDFS
Append and Delete Operations in HDFS
Exercise: Working with Files on HDFS
Hadoop HDFS: File Permissions
Course: 49 Minutes
Course Overview
The HDFS count and du Commands
Viewing and Setting File Permissions in HDFS
Applying Permissions Recursively in HDFS
An Introduction to Bash Scripting
Scripting HDFS Operations
Exploring the HDFS NameNode UI
Cleanup Operations in HDFS
Exercise: File Permissions on HDFS
Data Silos, Lakes, & Streams: Introduction
Course: 1 Hour, 20 Minutes
Course Overview
Data Silos
Data Lakes
Characteristics of Data Lakes
Data Lake Architecture, Features, and Challenges
Data Warehouses
Data Warehouses vs. Data Lakes
Data Streams
Migrating Data to AWS
Data Lakes on AWS
Working with Data Lakes on AWS
Exercise: Data Silos, Lakes, and Streams
Data Silos, Lakes, and Streams: Data Lakes on AWS
Course: 1 Hour, 10 Minutes
Course Overview
Create a Role for the AWS Glue Service
Upload Data to S3
Explore the Glue Web Console
Manually Create Glue Tables
Query the Data Lake Using Amazon Athena
Configure and Run Glue Crawlers
Access Data in Crawled Tables
Crawl Multiple CSV Files in the Same Folder Path
Merge Data in Multiple Files in the Same Folder Path
Work with Files Having the Exact Same Schema
Exercise: Data Lakes on AWS with S3 and Glue
Data Silos, Lakes, & Streams: Sources, Visualizations, & ETL Operations
Course: 1 Hour, 29 Minutes
Course Overview
Set Up a Redshift Cluster
Create Tables and Load Data From S3
Establish a JDBC Connection to Redshift
Crawl Redshift Using a JDBC Connection
Crawl DynamoDB
Configure QuickSight to Visualize Data
Visualize Data in QuickSight
Configure a Job to Perform Extract, Transform, Load
Execute an ETL Operation in Glue
Perform ETL to Back Up Redshift Data in S3 Buckets
Perform ETL to Back Up DynamoDB Data in S3 Buckets
Exercise: Multiple Sources, Visualizations, and ETL
Data Analysis Application
Course: 1 Hour, 25 Minutes
Course Overview
Install and Configure Anaconda Python
Install R Using Anaconda
Use Jupyter Notebook
Import and Export Data in Python
Import and Export Data in R
Deal with Missing Data in R
Transform Data in R
Work with Numpy
Work with Pandas
Mean, Median, and Mode in R
Analyze Data with Pandas
Plot Data in R
Visualize Data in Python
Exercise: Perform Data Analysis
Online Mentor
You can reach your mentor via chat or email.
Final Exam assessment
Estimated duration: 65 minutes
Practice Labs: Analyzing Data with Python (estimated duration: 8 hours)
Practice performing data analysis tasks using Python by configuring VSCode, loading data from SQLite into Pandas, grouping data and using box plots. Then, test your skills by answering assessment questions after using Python to calculate frequency distribution, measures of center, and coefficient of dispersion. This lab provides access to several tools commonly used in data science, including:
VS Code, Anaconda, Jupyter Notebook + Hub, Pandas, NumPy, SciPy, Seaborn Library, Spyder IDE
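The statistics named in the lab description can be sketched as follows. This is a minimal illustration that uses only Python's standard library (`sqlite3`, `statistics`, `collections`) instead of Pandas; the `scores` table, its columns, and its values are hypothetical. "Coefficient of dispersion" is assumed here to mean the coefficient of variation (standard deviation divided by the mean), which the course description does not confirm.

```python
import sqlite3
import statistics
from collections import Counter

# Hypothetical in-memory table standing in for the lab's SQLite data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (team TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 2), ("a", 4), ("a", 4), ("b", 4), ("b", 5), ("b", 5), ("b", 7), ("b", 9)],
)
rows = conn.execute("SELECT team, score FROM scores").fetchall()
conn.close()

scores = [score for _, score in rows]

# Frequency distribution: how often each value occurs.
frequency = Counter(scores)

# Measures of center.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Coefficient of variation (assumed meaning of "coefficient of dispersion"):
# population standard deviation relative to the mean.
coefficient_of_variation = statistics.pstdev(scores) / mean

# Group scores by team, the standard-library counterpart of a Pandas groupby.
by_team = {}
for team, score in rows:
    by_team.setdefault(team, []).append(score)
team_means = {team: statistics.mean(values) for team, values in by_team.items()}
```

In a Pandas workflow the same query result would typically be loaded into a DataFrame first; the plain lists and dictionaries here only keep the sketch free of third-party dependencies.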
Data Science Track 2: Data Wrangler
This track focuses on the data wrangler role, covering wrangling with Python, Mongo, and Hadoop.
Content: E-learning courses
Data Wrangling with Pandas: Working with Series & DataFrames
Course: 1 Hour, 11 Minutes
Course Overview
Installing Pandas
Pandas Series Objects
Operations on Series
Appending and Sorting Series Values
Pandas DataFrames
Indexing Operations with DataFrames
Missing Data
Column Aggregations
Statistical Operations
Data Wrangling with Pandas: Visualizations and Time-Series Data
Course: 1 Hour, 29 Minutes
Course Overview
Pandas and Matplotlib for Visualizations
Pie Charts, Box Plots, and Scatter Plots
Time-Series Data
Deltas and Percentage Change Calculations
Time Deltas and Date Ranges
Mismatched DataFrames and Missing Data
Working with String Data
Advanced Operations on Strings
Applying Functions on Series
Transforming Data With User-Defined Functions
Applying Functions on DataFrames
Exercise: Plot Charts and Transform Column Values
Data Wrangling with Pandas: Advanced Features
Course: 1 Hour, 12 Minutes
Course Overview
Grouping and Aggregations
MultiIndex DataFrames
Grouping and Aggregations with MultiIndex DataFrames
General Aggregation Functions
Filtering
Masking Column Values
Working with Duplicates
Working with Categorical Data
Filtering, Adding, and Removing Categories
Reindexing
Exercise: Filtering, Duplicates and Categorical Data
Data Wrangler 4: Cleaning Data in R
Course: 1 Hour, 3 Minutes
Course Overview
Types of Unclean Data
Data Quality
Downloading JSON Data
Excel Sheets
Reading Dirty CSVs
Querying Relational Databases
Joining Tabular Data
Spreading Data
Summarizing Data
Imputing Data
Extracting Matches
Exercise: Wrangling Data
Data Tools: Technology Landscape & Tools for Data Management
Course: 27 Minutes
Course Overview
Technology Landscape and Tools
Tool Comparison
Machine Learning in Data Analytics
Machine Learning Tools
Machine Learning Implementation
Python and R for Data Management
Cloud and Machine Learning
Exercise: Implement Machine Learning on Scikit-learn
Data Tools: Machine Learning & Deep Learning in the Cloud
Course: 23 Minutes
Course Overview
Microsoft Machine Learning Toolkit
AWS and Machine Learning
Spark Machine Learning Capabilities
Deep Learning Frameworks
Deep Learning Implementation
Data Mining and Analytical Tools
KNIME Capabilities
Exercise: Implement Deep Learning
Trifacta for Data Wrangling: Wrangling Data
Course: 50 Minutes
Course Overview
Standardizing Data
Formatting Dates
Filtering Rows
Replacing Values
Counting Matches
Splitting Columns
Merging Columns
Extracting Data
Conditional Aggregation
Reshaping Data
Joining Data
Exercise: Wrangling Data
MongoDB for Data Wrangling: Querying
Course: 1 Hour, 8 Minutes
Course Overview
Introduction to PyMongo
Document Structure
CRUD Operations
ObjectID and Timestamp
Query Operations
Projection Queries
Comparison Operators
Element Query Operators
The Regex Operator
Using the Size and All Operators
Text Search
Using mongoimport
Using mongoexport
Exercise: Performing a Query
MongoDB for Data Wrangling: Aggregation
Course: 51 Minutes
Course Overview
Aggregation Framework
Using Group
Using Match
Using Project
Using Limit and Sort
Using Unwind
Using Lookup
Using Indexes
Using Geospatial Indexes
Exercise: Performing an Aggregate Query
Getting Started with Hive: Introduction
Course: 56 Minutes
Course Overview
Hive as a Data Warehouse
Overview of Relational Databases
OLTP and OLAP
Hive and the Hadoop Ecosystem
HiveServer and The Metastore
Hive on Cloud Computing Platforms
Data Types in Hive
Data and Tables in Hive
Exercise: Introduction to Hive
Getting Started with Hive: Loading and Querying Data
Course: 1 Hour, 20 Minutes
Course Overview
Setting up a Hadoop Cluster on the Google Cloud
Creating a Hive Table
Running Simple Queries in Hive
Executing Hive Queries from the Shell
Joining Tables in Hive
Exploring the Hive Warehouse
External Tables in Hive
Modifying Tables in Hive
Temporary Tables in Hive
Loading Data into Tables in Hive
Populating Multiple Tables in Hive
Exercise: Loading and Querying Data in Hive
Getting Started with Hive: Viewing and Querying Complex Data
Course: 1 Hour, 14 Minutes
Course Overview
The Array Data Type in Hive
The Map Data Type in Hive
The Struct Type in Hive
The explode and posexplode Functions in Hive
Lateral Views in Hive
Multiple Lateral Views in Hive
Set Operations in Hive
The IN and EXISTS clauses in Hive
Creating and Populating Tables in Hive
Views in Hive
Exercise: Viewing and Querying Complex Data
Getting Started with Hive: Optimizing Query Executions
Course: 43 Minutes
Course Overview
Hive Queries as MapReduce Jobs
Techniques to Improve Query Performance in Hive
Partitioning Tables in Hive
Bucketing Tables in Hive
Structuring Join Queries in Hive
Exercise: Optimizing Query Execution in Hive
Getting Started with Hive: Optimizing Query Executions with Partitioning
Course: 1 Hour, 1 Minute
Course Overview
Setting up a Hadoop Cluster on the Google Cloud
Creating a Partitioned Table in Hive
Working with Partitions in Hive
Populating Partitions in Hive
Partitioning External Tables in Hive
Modifying Partitions in Hive
Dynamic Partitions in Hive
Using Multiple Columns for Partitioning in Hive
Exercise: Optimize Executions with Partitioning
Getting Started with Hive: Bucketing & Window Functions
Course: 1 Hour, 4 Minutes
Course Overview
Apply Bucketing for a Table in Hive
Using Bucketing and Partitioning Together in Hive
Sorting a Bucket's Contents in Hive
Sampling a Table in Hive
Joining Multiple Tables in Hive
Introducing Window Functions in Hive
Window Functions with Partitions in Hive
Exercise: Bucketing and Window Functions in Hive
Getting Started with Hadoop: Filtering Data Using MapReduce
Course: 59 Minutes
Course Overview
Counting the Data Points in Each Category
The Reducer and Driver Programs
Building and Executing the Application
A Simple Filter Using MapReduce
Executing and Examining the Output
Extracting the Unique Values in a Column
Viewing the Distinct Values Extracted
Exercise: Filtering Data Using MapReduce
Getting Started with Hadoop: MapReduce Applications With Combiners
Course: 1 Hour, 24 Minutes
Course Overview
Combiners in MapReduce
Revisiting MapReduce
Working with Combiners
Using Combiners for Calculating Averages
Creating a Project to Calculate Averages
Coding the Map and Reduce Phases
Configure the Application in the Driver
Executing the Application and Examining the Output
Adding a Combiner to a MapReduce Application
Conveying a Pair of Numbers from the Mapper
Running the Fixed Application
Exercise: Optimizing MapReduce With Combiners
Getting Started with Hadoop: Advanced Operations Using MapReduce
Course: 49 Minutes
Course Overview
Defining a User-Defined Type for a PriorityQueue
Implementing a PriorityQueue in a Mapper
Using a PriorityQueue in a Reducer
Running and Verifying the Results
Building an Inverted Index - Map Phase
Building an Inverted Index - Reduce Phase
Executing the Application and Viewing the Index
Exercise: Advanced Operations Using MapReduce
Accessing Data with Spark: Data Analysis Using the Spark DataFrame API
Course: 1 Hour, 12 Minutes
Course Overview
Performance Improvements in Spark
Broadcast Variables and Accumulators
Loading Data into a DataFrame
Sampling the Contents of a DataFrame
Grouping and Aggregations
Visualizing Data in a DataFrame
Trimming and Cleaning Data
User-Defined Functions and DataFrames
Combining Filters, Aggregations, and Sorting
Using Broadcast Variables
Using Accumulators
Exporting DataFrame Contents
Custom Accumulators
Join Operations
Exercise: Data Analysis Using the DataFrame API
Accessing Data with Spark: Data Analysis using Spark SQL
Course: 55 Minutes
Course Overview
The Spark Catalyst Optimizer
Introduction to Spark SQL
Preparing Data for Analysis
Running SQL Queries
Inferred and Explicit Schemas
Windowing in Spark
Applying Window Functions
Exercise: Data Analysis Using Spark SQL
Data Lake: Framework & Design Implementation
Course: 34 Minutes
Course Overview
Data Lakes and Data Warehouses
Data Lake Selection Criteria
Data Lake and Data Democratization
Data Lake Design Principles
AWS Data Lake Architecture
Implement AWS Data Store
Data Lake For On-Premise and Multi-Cloud
Data Processing Frameworks for Data Lake
Exercise: Implement AWS Data Store
Data Lake: Architectures & Data Management Principles
Course: 35 Minutes
Course Overview
Real-Time Big Data Architectures
Data Lake Reference Architecture
Data Ingestion and File Formats
Ingestion Using Sqoop
Data Processing Strategies
Deriving Value from Data Lakes
Data Life Cycle
S3 and Glacier
Exercise: Ingest Data and Implement Archival Policy
Data Architecture - Deep Dive: Design & Implementation
Course: 36 Minutes
Course Overview
Data Complexity Management Strategies
Data Modeling Process
Distributed Data Management
Partitioning Methods and Criteria
MongoDB Partitioning
Hybrid Data Architectures
Implement Directed Acyclic Graph
CAP Theorem
Batch vs. Streaming
Read and Write Concerns
Exercise: Implement Serverless Architecture
Data Architecture - Deep Dive: Microservices & Serverless Computing
Course: 26 Minutes
Course Overview
Microservices and Data
Serverless and Lambda Architecture
Lambda Implementation
Cluster Benefits
Data Architecture Types
Data Discovery Process
Data Risk Types
Data POC
Exercise: Implement Lambda Architecture
Online Mentor
You can reach your mentor via chat or email.
Final Exam assessment
Estimated duration: 90 minutes
Nova Learning, January 2021
Practice Labs: Data Wrangling with Python (estimated duration: 8 hours)
Perform data wrangling tasks including using a Pandas DataFrame to convert multiple Excel sheets to separate JSON documents, extract a table from an HTML file, use mean substitution and convert dates within a DataFrame. Then, test your skills by answering assessment questions after using a Pandas DataFrame to convert a CSV document to a JSON document, replace missing values with a default value, split a column with a delimiter and combine two columns by concatenating text.
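The wrangling steps listed above can be sketched in a few lines. This is an illustration only: it uses Python's standard library (`csv`, `json`, `statistics`, `datetime`) rather than a Pandas DataFrame, and the CSV content, column names, date format, and the `"Unknown"` default are all hypothetical.

```python
import csv
import io
import json
import statistics
from datetime import datetime

# Hypothetical CSV content; an empty field marks a missing value.
raw_csv = """name,joined,age,city
Ada Lovelace,10/12/2020,36,London
Alan Turing,23/06/2021,,
Grace Hopper,09/12/2019,44,New York
"""

records = list(csv.DictReader(io.StringIO(raw_csv)))

# Mean substitution: fill missing ages with the mean of the known ages.
known_ages = [int(r["age"]) for r in records if r["age"]]
mean_age = statistics.mean(known_ages)

for r in records:
    r["age"] = int(r["age"]) if r["age"] else mean_age
    # Replace a missing value with a default.
    r["city"] = r["city"] or "Unknown"
    # Convert day/month/year strings to ISO 8601 dates.
    r["joined"] = datetime.strptime(r["joined"], "%d/%m/%Y").date().isoformat()
    # Split one column on a delimiter (first space only).
    r["first_name"], r["last_name"] = r["name"].split(" ", 1)
    # Combine two columns by concatenating text.
    r["label"] = f'{r["last_name"]}, {r["city"]}'

# Convert the cleaned records to a JSON document.
json_document = json.dumps(records, indent=2)
```

Each step has a direct DataFrame counterpart in Pandas; the row-by-row loop is used here only to make every transformation explicit.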
Data Science Track 3: Data Ops
The track's objective is to prepare the learner for a Data Ops role, with a focus on governance, security, and harnessing volume and velocity.
Content: E-learning courses
Deploying Data Tools: Data Science Tools
Course: 48 Minutes
Course Overview
Data Science Platform
Challenges of Deploying Data Science Tools
Considerations for Data Science Tools
Data Science Workflow
Data Science Analytic Tools
Data Science Visualization Tools
Data Science Database Tools
Benefits of Deploying Cloud-Based Tools
Challenges of Deploying Cloud-Based Tools
What is DevOps
DevOps for Data Science
Exercise: Identifying Uses of Data Science Tools
Delivering Dashboards: Management Patterns
Course: 34 Minutes
Course Overview
Analytical Visualization
Dashboard Types
Data Management
Dashboard Components
Dashboard Best Practices
Dashboard Using ELK
Dashboard Using Power BI
Chart Selection Criteria
Leaderboards and Scorecards
Scorecard Types
Exercise: Create Dashboards with Power BI and ELK
Delivering Dashboards: Exploration & Analytics
Course: 31 Minutes
Course Overview
Data Exploration Using Charts
Analytical Visualization Tools
Bar and Line Charts
Dashboarding with Kibana
Dashboard Sharing with Kibana
Dashboarding with Tableau
Dashboarding with QlikView
Data Ingest and Dashboards
Dashboard Patterns
Monitoring Dashboards
Exercise: Create Dashboards Using Kibana and Tableau
Cloud Data Architecture: DevOps & Containerization
Course: 45 Minutes
Course Overview
Containerization on the Cloud
Benefits of Containers
Serverless Computing
DevOps in the Cloud
AWS OpsWorks
Storage Classification
Cloud and Machine Learning
Cloud and BI Analytics
Exercise: Containerization and Serverless Computing
Compliance Issues and Strategies: Data Compliance
Course: 44 Minutes
Course Overview
Data Compliance Issues
Data Regulations
The Importance of Global Standards
Risk and Company Standards
Myths and Facts of Data Compliance
Compliance Training for Users
Compliance Training for Management
The Benefits of a Data Compliance Program
Elements of a Good Compliance Strategy
Building a Compliance Strategy
Reporting and Response Procedures
Exercise: Explain the Importance of Data Compliance
Implementing Governance Strategies
Course: 46 Minutes
Course Overview
Governance and its Relationship with Big Data
Why Big Data Requires Governance
Requirements for Big Data Governance
Why is Big Data Different?
Identifying Data
Identifying Stakeholders
Cloud Technologies and Data Governance
Designing a Data Governance Process
Managing a Data Governance Strategy
Monitoring a Data Governance Strategy
Maintaining a Data Governance Strategy
Exercise: Defining Data Governance Strategies
Data Access & Governance Policies: Data Access Oversight and IAM
Course: 59 Minutes
Course Overview
Data Access Governance
Risk and Data Safety Compliance
Data Access Patterns
Data Breach Prevention
Least Privilege
Assign and View Effective File System Permissions
Identity and Access Management
Create an AWS IAM User and Group
Assign AWS IAM Group Permissions
Vulnerability Assessments
Implement Effective Security Controls
Exercise: Implement Data Access Governance Solutions
Data Access & Governance Policies: Data Classification, Encryption, and Monitoring
Course: 1 Hour, 19 Minutes
Course Overview
Data Classification
Classify Data Using Microsoft FSRM
Data Encryption
Encrypt Data at Rest
Encrypt Data in Motion
Implement Security Compliance Checking
Examine Data Access Trends
Data Access Monitoring Solutions
Logging, Auditing, and Data Analytics
Configure a Custom Filtered Log View
Enable Windows Data Access Auditing
Exercise: Implement Data Confidentiality
Streaming Data Architectures: An Introduction to Streaming Data
Course: 51 Minutes
Course Overview
Introduction to Streaming data
The Stream Processing Model
The Message Transport
Stream Processing with RDDs
Structured Streaming for Continuous Applications
Streaming vs Structured Streaming
Triggers and Output Modes
Exercise: Working with Streaming Data
Streaming Data Architectures: Processing Streaming Data
Course: 53 Minutes
Course Overview
PySpark Setup
Setting Up a Socket Stream with Netcat
The Update Output Mode
Using a File Input Stream
The Append Output Mode
The Complete Output Mode
Aggregations on Streaming Data
SQL Operations on Streaming Data
User-Defined Functions (UDFs)
Exercise: Processing Streaming Data
Scalable Data Architectures: Introduction
Course: 53 Minutes
Course Overview
Scalable Architectures with Distributed Computing
Introducing Data Warehouses
Contrasting Warehouses with Relational Databases
Data Warehouses for Analytical Processing
Data Warehouse Architectural Components
Amazon Redshift - A Data Warehouse on the Cloud
Exercise: Scalable Data Architectures
Scalable Data Architectures: Introduction to Amazon Redshift
Course: 55 Minutes
Course Overview
Provisioning a Redshift Cluster Using Quick Launch
Creating a Redshift Cluster With Additional Detail
Exploring the Redshift Configs and Metrics
Attaching an IAM Role to a Redshift Cluster
Creating an AWS User to Work With Redshift
Installing and Configuring the AWS CLI
Running Queries from the Redshift Query Editor
Exercise: An Introduction to Amazon Redshift
Scalable Data Architectures: Working with Amazon Redshift & QuickSight
Course: 1 Hour, 18 Minutes
Course Overview
Loading Data from Amazon S3 to a Redshift Cluster
Running Queries and Evaluating Their Execution
Querying a Redshift Cluster Using a SQL client
Working with Automated Snapshots
Restoring Tables from a Snapshot
Horizontal Scaling of a Redshift Cluster
Vertical and Horizontal Scaling of a Cluster
Configuring Access from QuickSight to Redshift
Loading a Dataset to QuickSight
Creating Visualizations with QuickSight
Exercise: Working with Redshift and QuickSight
Building Data Pipelines
Course: 1 Hour, 10 Minutes
Course Overview
Data Pipelines Overview
Traditional ETL Pipeline with Batch Processing
Data Pipeline Tools
Setup and Install Airflow
Apache Airflow
Airflow Workflows
Airflow Tasks
Airflow Dependencies
ETL Pipeline with Airflow
Automated Pipeline without ETL
Airflow Command Line Testing
Exercise: Using Apache Airflow
Data Pipeline: Process Implementation Using Tableau & AWS
Course: 39 Minutes
Course Overview
Data Pipeline
Data Pipeline Processes
Data Pipeline Stages
Data Pipeline Technologies
Data Source Types
Scheduled Data Pipeline
Tableau Server and Utilities
Data Pipeline Using Tableau
Data Pipeline on AWS
Exercise: Build Data Pipelines with Tableau
Data Pipeline: Using Frameworks for Advanced Data Management
Course: 33 Minutes
Course Overview
Celery and Luigi
Data Pipeline with Python Luigi
Working with Dask Library
Dask Arrays
Data Exploration and Visualization Frameworks
Spark and Tableau
Streaming Data Visualization with Python
Data Pipeline Open Source Tools
Exercise: Implement Data Pipelines with Luigi
Data Sources: Integration
Course: 40 Minutes
Course Overview
Elements of IoT Solutions
Service Categories in IoT
IoT Capabilities and Maturity Model
IoT Design Principles
IoT Cloud Architectures
MQTT and XMPP
IoT Controllers
IoT Data Management
Securing IoT
Exercise: Generating Data Streams
Data Sources: Implementing Edge on the Cloud
Course: 31 Minutes
Course Overview
AWS IoT Greengrass
GCP IoT Edge
AWS IoT over WebSockets
IoT Device Simulator
Generating Streams of Data Using MQTT
Exercise: Working with IoT Device Simulators
Securing Big Data Streams
Course: 1 Hour, 3 Minutes
Course Overview
Big Data Security Concerns
Streaming Data Security Concerns
NoSQL Database Security Concerns
Distributed Processing Security Risks
Data Mining and Analytics Privacy Flaws
End-Point Device Tampering Risks
Secure Big Data
Secure Data Streams
Secure Data In Motion
End-Point Input Validation and Filtering
Secure Data at Rest with Symmetric Ciphers
Exercise: Securing Big Data Streams
Harnessing Data Volume & Velocity: Big Data to Smart Data
Course: 39 Minutes
Course Overview
Comparing Big Data and Smart Data
Smart Data and Edge Technologies
Big Data to Smart Data Formation
Smart Data and Smart Processes
Smart Data Use Cases
Smart Data Life Cycle
Big Data to Smart Data Using k-NN
Smart Data Frameworks
Smart Data to Business
Clustering Smart Data
Smart Data Integration
Exercise: Transform Big Data to Smart Data
Data Rollbacks: Transaction Rollbacks & Their Impact
Course: 36 Minutes
Course Overview
Rollback Process
State of Transactions
Transaction Types
SQL Transaction Management
Transaction Log Operations
Deadlock Management
SQL Server Rollback Mechanism
SQL Server Rollback Mechanism Implementation
Exercise: Implement Transactions with SQL Server
Data Rollbacks: Transaction Management & Rollbacks in NoSQL
Course: 29 Minutes
Course Overview
NoSQL and SQL Transaction Management
MongoDB Transactions
Manage Multi-Document Transactions in MongoDB
Change Data Capture
Change Stream in MongoDB
MongoDB Change Stream Implementation
Exercise: MongoDB Transactions and Change Streams
Online Mentor
You can reach your mentor via chat or email.
Final Exam assessment
Estimated duration: 90 minutes
Practice Labs: Implementing Data Ops with Python (estimated duration: 8 hours)
Perform data ops tasks with Python including working with row subsets, creating new columns with Regex, performing joins and spreading rows. Then, test your skills by answering assessment questions after working with field subsets and computed columns, and performing set operations and binding rows.
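As a rough illustration of the operations this lab names (row subsets, regex-derived columns, joins, spreading rows, set operations, and binding rows), the sketch below works on plain Python lists and dictionaries rather than DataFrames. The order and customer records, the ID format, and the threshold are hypothetical.

```python
import re

# Hypothetical records; in the lab these would live in DataFrames.
orders = [
    {"order_id": "ORD-2021-001", "customer_id": 1, "amount": 120.0},
    {"order_id": "ORD-2021-002", "customer_id": 2, "amount": 35.5},
    {"order_id": "ORD-2022-003", "customer_id": 1, "amount": 80.0},
]
customers = {1: "Acme", 2: "Globex"}  # customer_id -> name

# Row subset: keep only orders above a threshold.
large_orders = [o for o in orders if o["amount"] > 50]

# New column from a regular expression: extract the year from the order ID.
year_pattern = re.compile(r"ORD-(\d{4})-\d+")
for o in orders:
    o["year"] = int(year_pattern.match(o["order_id"]).group(1))

# Inner join: attach the customer name to each order.
joined = [
    {**o, "customer": customers[o["customer_id"]]}
    for o in orders
    if o["customer_id"] in customers
]

# Spreading rows (long to wide): one entry per customer, one key per year.
wide = {}
for row in joined:
    totals = wide.setdefault(row["customer"], {})
    totals[row["year"]] = totals.get(row["year"], 0) + row["amount"]

# Set operation: customers with orders in 2021 but not in 2022.
in_2021 = {r["customer"] for r in joined if r["year"] == 2021}
in_2022 = {r["customer"] for r in joined if r["year"] == 2022}
only_2021 = in_2021 - in_2022

# Binding rows: stack two record lists that share the same fields.
extra_orders = [
    {"order_id": "ORD-2022-004", "customer_id": 2, "amount": 10.0, "year": 2022}
]
all_orders = orders + extra_orders
```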
Data Science Track 4: Data Scientist
This track focuses on the Data Scientist role, covering visualization, APIs, and ML and DL algorithms.
Content: E-learning courses
Balancing the Four Vs of Data: The Four Vs of Data
Course: 40 Minutes
Course Overview
Overview of the Four Vs
The Importance of Volume
The Importance of Variety
The Importance of Velocity
The Importance of Veracity
The Relationship Between the Four Vs
Variety and Data Structure
Validity and Volatility
Finding Balance in the Four Vs
Use Cases
Extracting Value from the Four Vs
Exercise: Describe the Four Vs of Big Data
Data Driven Organizations
Course: 1 Hour, 15 Minutes
Course Overview
Data Driven Organizations
Decision Making
Analytic Maturity
Analytic Roles
Data Source Priority
Facets of Data Quality
Power BI Data Visualization
Missing Data
Duplicate Data
Truncated Data
Data Provenance
Raw Data to Insights: Data Ingestion & Statistical Analysis
Course: 54 Minutes
Course Overview
Statistical Analysis
Data Correction
Outlier Detection
Data Architecture Pattern
Data Ingestion Tools
Kafka and Apache NiFi
Apache Sqoop Ingest
Ingest Using WaveFront
Raw Data to Insights: Data Management & Decision Making
Course: 57 Minutes
Course Overview
Data-driven Decision Making Framework
Loading Data into R
Preparing Data
Data Correction Approach
Data Correction Using Simple Transformation
Data Correction Using Deductive Correction
Distributed Data Management
Data Analytics
Data Analytics Using R
Predictive Modeling
Tableau Desktop: Real Time Dashboards
Course: 1 Hour, 8 Minutes
Course Overview
Introducing Real Time Dashboards
Creating Real Time Dashboards with Tableau
Build a Tableau Dashboard
Real Time Dashboard Updates in Tableau
Organizing Your Tableau Dashboard
Formatting Your Tableau Dashboard
Interactive Tableau Dashboard
Tableau Dashboard Starters
Tableau Dashboard Extensions
Tableau Dashboards and Story Points
Sharing your Tableau Dashboard
Storytelling with Data: Introduction
Course: 47 Minutes
Course Overview
Storytelling Process
Interpreting Context
Analysis Types
Who, What, and How of Storytelling
Visualization for Storytelling
Graphical Tools for Data Elaboration
Storytelling Scenarios
Storyboarding
Storytelling with Data: Tableau & Power BI
Course: 57 Minutes
Course Overview
Visual Selection
Slopegraphs
Bar Charts and Types of Bar Charts
Clutter and Clutter Elimination
Gestalt Principle
Story Design Best Practices
Tools for Storytelling
Decluttering
Crafting Visual Data
Visual Design Concerns
Storytelling with Power BI
Model Visual and Tableau
Python for Data Science: Basic Data Visualization Using Seaborn
Course: 1 Hour, 7 Minutes
Course Overview
Introduction to Seaborn
Install Seaborn
Simple Univariate Distributions
Configure Univariate Distribution Plots
Simple Bivariate Distributions
Explore Different Types of Bivariate Distributions
Analyze Multiple Variable Pairs
Regression Plots
Themes and Styles in Seaborn
Python for Data Science: Advanced Data Visualization Using Seaborn
Course: 1 Hour, 4 Minutes
Course Overview
Searching for Patterns in a Dataset
Configuring Plot Aesthetics
Normal Distribution and Outliers
Distributions Within Categories - Part 1
Distributions Within Categories - Part 2
Analyzing Categories with Facet Grids - Part 1
Analyzing Categories with Facet Grids - Part 2
Introducing Color Palettes
Using Color Palettes
Data Science Statistics: Using Python to Compute & Visualize Statistics
Course: 1 Hour, 16 Minutes
Course Overview
An Introduction to Matplotlib
Analyzing Data Using NumPy and Pandas
Visualizing Univariate and Bivariate Distributions
Summary Statistics Using Native Python Functions
Summary Statistics Using NumPy
Summary Statistics Using the SciPy Library
Correlation and Covariance
Z-score
R for Data Science: Data Visualization
Course: 33 Minutes
Course Overview
An Introduction to Matplotlib
Analyzing Data Using NumPy and Pandas
Visualizing Univariate and Bivariate Distributions
Summary Statistics Using Native Python Functions
Summary Statistics Using NumPy
Summary Statistics Using the SciPy Library
Correlation and Covariance
Z-score
Advanced Visualizations & Dashboards: Visualization Using Python
Course: 38 Minutes
Course Overview
Relevance of Data Visualization for Business
Libraries for Data Visualization in Python
Python Data Visualization Environment Configuration
Matplotlib Libraries for Visualization
Bar Chart Using ggplot
Bokeh and Pygal
Select Visualization Libraries
Interactive Graphs and Image Files
Plot Graphs
Multiple Lines in Graphs
Advanced Visualizations & Dashboards: Visualization Using R
Data Insights, Anomalies, & Verification: Handling Anomalies
Course: 46 Minutes
Course Overview
Data and Anomaly Sources
Decomposition and Forecasting
Examine Data Using Randomization Tests
Anomaly Detection
Anomaly Detection Techniques
Anomaly Detection with scikit-learn
Anomaly Detection Tools
Anomaly Detection Rules
Data Insights, Anomalies, & Verification: Machine Learning & Visualization Tools
Course: 51 Minutes
Course Overview
Machine Learning Anomaly Detection Techniques
Comparing Anomaly Detection Algorithms
Anomaly Detection Using R
Online Anomaly Detection Components
Online Anomaly Detection Approaches
Anomaly Detection Use Cases
Anomaly Detection with Visualization Tools
Anomaly Detection with Mathematical Approaches
Cluster-Based Anomaly Detection
Data Science Statistics: Applied Inferential Statistics
Course: 1 Hour, 19 Minutes
Course Overview
The One-Sample T-test
Independent and Paired T-tests
Testing Hypotheses with T-tests
Loading and Analyzing a Skewed Dataset
Measuring Skewness and Kurtosis
Preparing a Dataset for Regression
Simple Linear Regression
Multiple Linear Regression
Data Research Techniques
Course: 33 Minutes
Course Overview
Data Research Fundamentals
Data Research Steps
Values, Variables, and Observations
JMP Scale of Measurement
Non-experimental and Experimental Research
Descriptive and Inferential Statistical Analysis
Inferential Tests
Case Study of Clinical Data Research
Data Research in Sales Management
Data Research Exploration Techniques
Course: 50 Minutes
Course Overview
Fundamentals of Exploratory Data Analysis
Data Exploration Types
Working with R
Data Exploration in R
Data Exploration Using Plots
Python Packages for Data Exploration
Data Exploration Using Python
Data Research Using Linear Algebra
Linear Algebra for Data Research
Data Research Statistical Approaches
Course: 43 Minutes
Course Overview
Role of Statistics in Data Research
Discrete vs. Continuous Distribution
PDF and CDF
Binomial Distribution
Interval Estimation
Point and Interval Estimation
Data Visualization Techniques
Data Visualization Using R
Data Integration Techniques
Creating Plots
Missing Values and Outliers
Machine & Deep Learning Algorithms: Introduction
Course: 46 Minutes
Course Overview
Machine Learning Algorithms
How Machine Learning Works
Introduction to Pandas ML
Support Vector Machines
Overfitting
Machine & Deep Learning Algorithms: Regression & Clustering
Course: 49 Minutes
Course Overview
The Confusion Matrix
An Introduction to Regression
Applications of Regression
Supervised and Unsupervised Learning
Clustering
Principal Component Analysis
Machine & Deep Learning Algorithms: Data Preparation in Pandas ML
Course: 1 Hour, 4 Minutes
Course Overview
Data Preparation in scikit-learn
Training and Evaluating Models in scikit-learn
Introducing the Pandas ML ModelFrame
Training and Evaluating Models in Pandas ML
Preparing Data for Regression
Evaluating Regression Models
Preparing Data for Clustering
The K-Means Clustering Algorithm
Machine & Deep Learning Algorithms: Imbalanced Datasets Using Pandas ML
Course: 1 Hour, 24 Minutes
Course Overview
Analyzing an Imbalanced Dataset
The RandomOverSampler
The SMOTE Oversampler
Undersampling Using imbalanced-learn
Ensemble Classifiers for Imbalanced Data
Combination Samplers
Finding Correlations in a Dataset
Building a Multi-Label Classification Model
Dimensionality Reduction with PCA
Imbalanced Learn and PCA
Creating Data APIs Using Node.js
Course: 1 Hour, 31 Minutes
Course Overview
API Prerequisites
Building a RESTful API Using Node.js and Express.js
RESTful API with OAuth
HTTP Server with Hapi.js
API Modules
Returning Data with JSON
Nodemon for Development Workflow
API Requests
Postman for API
Deploying APIs
Social Media APIs
Exercise: Building RESTful APIs
Online Mentor
You can reach your Mentor by entering chats or submitting an email.
Final Exam assessment
Estimated duration: 90 minutes
Practice Labs: Data Visualization with Python (estimated duration: 8 hours)
Perform data visualization tasks with Python, such as creating scatter plots, plotting linear regression, using logistic regression, and creating decision trees. Then test your skills by answering assessment questions after creating time-series graphs, resampling observations, creating histograms, and using a grid pair.
Get started today!
Strengthen your future in a data-driven world. ✔️ Build strong skills in Python, R, and big data ✔️ Become an expert in data visualization and integration ✔️ Gain directly applicable knowledge and a certificate
Specifications
Article number
117950801
SKU
117950801
Language
English
Instructor Qualifications
Certified
Course Format and Length
Video lessons with subtitles, interactive elements, assignments, and tests
Course duration
120 hours
Assessments
The assessment tests your knowledge and applied skills on the topics in the learning path. It is available for 365 days after activation.
Online mentor
You have 24/7 access to an online mentor for all your specific technical questions about the subject of study. The online mentor is available for 365 days after activation, depending on the Learning Kit chosen.
Online Virtual Labs
Receive 12 months of access to virtual labs that correspond to the traditional course configuration. Active for 365 days after activation; availability varies per training.
Progress Tracking
Access to Material
365 days
Technical Requirements
A computer or mobile device, a stable internet connection, and a web browser such as Chrome, Firefox, Safari, or Edge.
Support
Helpdesk and online knowledge base, 24/7
Certification
Certificate of participation in PDF format
Price and Costs
Course price with no additional costs
Cancellation Policy and Money-Back Guarantee
We assess this on a case-by-case basis
Award-Winning E-learning
Tip!
Make sure you have a quiet learning environment, time and motivation, audio equipment such as headphones or speakers, and account information such as login credentials for the e-learning platform.