SKU: 117950801

Data Science with Python Masterclass E-Learning

List price €1,249.00; now €999.00 excl. VAT (€1,208.79 incl. VAT)

Award-winning Data Science with Python Masterclass E-Learning with access to an online mentor via chat or e-mail, a final exam assessment, and Practice Labs.

Brand:
Python
Discounts:
  • Buy 2 for €979.02 each and save 2%
  • Buy 3 for €969.03 each and save 3%
  • Buy 4 for €959.04 each and save 4%
  • Buy 5 for €949.05 each and save 5%
  • Buy 10 for €899.10 each and save 10%
  • Buy 25 for €849.15 each and save 15%
  • Buy 50 for €799.20 each and save 20%
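
The tiered prices listed above are consistent with a plain percentage reduction on the 999.00 unit price. The following is a minimal sketch of that arithmetic, not the shop's actual pricing logic. The names `unit_price` and `price_incl_vat` are hypothetical, the 21% VAT rate is an assumption inferred from the two listed prices (1,208.79 / 999.00), and the rule that quantities between two tiers get the highest tier reached is also an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP

BASE_PRICE = Decimal("999.00")  # per-unit price excl. VAT, taken from the listing
VAT_RATE = Decimal("0.21")      # assumption: inferred from 1,208.79 / 999.00

# Minimum quantity -> discount percentage, as listed on the page.
TIERS = {2: 2, 3: 3, 4: 4, 5: 5, 10: 10, 25: 15, 50: 20}


def unit_price(quantity: int) -> Decimal:
    """Per-unit price excl. VAT for an order of `quantity` licences.

    Assumption: a quantity between two tiers gets the highest tier reached.
    """
    pct = max((p for q, p in TIERS.items() if quantity >= q), default=0)
    price = BASE_PRICE * (Decimal(100) - pct) / Decimal(100)
    return price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def price_incl_vat(price_excl_vat: Decimal) -> Decimal:
    """Add VAT at the assumed rate and round to whole cents."""
    total = price_excl_vat * (Decimal(1) + VAT_RATE)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Calling `unit_price(10)` reproduces the listed 899.10 per unit, and `price_incl_vat(BASE_PRICE)` reproduces the listed 1,208.79. `Decimal` is used rather than `float` so that cent values round exactly.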
Availability:
In stock
Delivery time:
Order before 16:00 and start today.
  • Order before 17:00 and start today
  • Personal service from our expert team
  • Secure payment
  • Pay online or by invoice
  • Lowest price guarantee

Data Science with Python Masterclass E-Learning

This learning journey first lays a foundation in data architecture, statistics, and data analysis programming with Python and R, the first step toward the knowledge needed to move away from disparate, legacy data sources. You then learn to wrangle data with Python and R and to integrate that data with Spark and Hadoop. Next, you learn how to operationalize and scale data while taking compliance and governance into account. To complete the journey, you learn how to visualize that data to support smart business decisions.

This learning path, with more than 120 hours of online content, is divided into the following four tracks:

Data Science Track 1: Data Analyst
Data Science Track 2: Data Wrangler
Data Science Track 3: Data Ops
Data Science Track 4: Data Scientist

Course content

Data Science Track 1: Data Analyst

This track covers the data analyst role, with a focus on Python, R, architecture, statistics, and Spark.

E-Learning courses

Data Architecture Primer

Course: 1 Hour, 4 Minutes
Course Overview
Data Defined
Data Privacy
The Data Lifecycle
SQL vs. NoSQL
Create an Entity Relationship Diagram
Implement a SQL Solution
Implement a NoSQL Solution
Big Data
Data Architecture and Governance
IT Data System Architecture Types
Data Analytics and Reporting
Exercise: Implement Data Architecture Best Practices

Data Engineering Fundamentals

Course: 46 Minutes
Course Overview
Overview of Distributed Systems
Batch vs. In-Memory Processing
NoSQL Stores
Tools for Data Management
What is ETL?
ETL with Talend Open Studio
Data Modeling
AI and Machine Learning
Data Partitioning
Data Engineering
Data Reporting
Exercise: Create a Data Model

Python for Data Science: Introduction to NumPy for Multi-dimensional Data

Course: 1 Hour
Course Overview
Introduction to NumPy and the NumPy Ecosystem
Array Creation - Part 1
Array Creation - Part 2
Printing Arrays
Basic Array Operations
Universal Functions
Indexing and Slicing
Iterating Over Arrays
Reshaping Arrays
Exercise: Python NumPy Array Operations

Python for Data Science: Advanced Operations with NumPy Arrays

Course: 1 Hour, 8 Minutes
Course Overview
Splitting NumPy Arrays
Images as Arrays
Image Manipulation Using NumPy
Views and NumPy Arrays
Deep Copies of Arrays
Introduction to Index Masks
Applying Index Masks
Indexing with Boolean Masks
Structured Arrays
Understanding Array Broadcasting
Applying Broadcasting Rules on Array Operations
Exercise: NumPy Multi-dimensional Array Operations

Python for Data Science: Introduction to Pandas

Course: 1 Hour, 6 Minutes
Course Overview
Features of Pandas and the Pandas Ecosystem
Introduction to Pandas
Work with Pandas
Introduction to DataFrames
Work with DataFrames
Load Data into a DataFrame
Add and Delete DataFrame Contents
Select Parts of a DataFrame
Access Pandas DataFrames
Introduction to Multi-Indexing in a DataFrame
Reshape DataFrames
Reshape DataFrames Using Stack and Melt Operations
Exercise: Pandas for Basic Tabular Data Manipulation

Python for Data Science: Manipulating and Analyzing Data in Pandas DataFrames

Course: 45 Minutes
Course Overview
Iterating Over the Contents of a DataFrame
Exporting a DataFrame
Sorting
Handling Missing Data
Grouping with a Multi-Index
Merging DataFrames
Applying Join Operations on DataFrames
Pandas and Relational Databases
Exercise: Pandas for Advanced Data Manipulation

R for Data Science: Data Structures

Course: 52 Minutes
Course Overview
Creating Vectors
Manipulating Vectors
Sorting Vectors
Using Lists
Creating Matrices
Matrix Operations
Creating Factors
Creating Data Frames
Data Frame Operations
Exercise: Creating and Using a Data Frame

R for Data Science: Importing and Exporting Data

Course: 34 Minutes
Course Overview
Reading from CSV
Reading from Excel
Reading from HTML
Exporting to CSV
Exporting to Excel
Exporting to HTML
Exercise: Reading and Writing Data

R for Data Science: Data Exploration

Course: 41 Minutes
Course Overview
Creating dplyr Tables
Selecting Subsets
Filtering Tabular Data
Piping Data
Mutating Data
Summarizing Data
Combining Datasets
Grouping Data
Exercise: Querying Data

R for Data Science: Regression Methods

Course: 37 Minutes
Course Overview
Linear Data Preparation
Creating Linear Models
Interpreting Model Output
Using Linear Prediction
Logistic Data Preparation
Using glm
Exercise: Creating a Linear Model

R for Data Science: Classification & Clustering

Course: 39 Minutes
Course Overview
Preparing Data for Classification
Using rpart
Using ctree
Preparing Data for Clustering
Using K-Means Clustering
Using Hierarchical Clustering
Exercise: Creating a Decision Tree

Data Science Statistics: Simple Descriptive Statistics

Course: 1 Hour, 11 Minutes
Course Overview
Descriptive and Inferential Statistics
Population vs. Sample
Probability vs. Non-Probability Sampling
Mean
Median
Mode
IQR
Variance
Exercise: Using Descriptive Statistics

Data Science Statistics: Common Approaches to Sampling Data

Course: 47 Minutes
Course Overview
Terms in Sampling
Sampling Bias
Simple Random Sampling
Systematic Random Sampling
Stratified Sampling
Non-Probability Sampling
Exercise: Efficient and Correct Sampling

Data Science Statistics: Inferential Statistics

Course: 1 Hour, 2 Minutes
Course Overview
Gaussian Distribution
Inferential Statistics and Hypothesis Testing
Simplified Example of Hypothesis Testing
T-tests
Skewness and Kurtosis
Correlation and Autocorrelation
Introducing Linear Regression
Overfitting and Goodness-of-Fit
Exercise: Basic Inferential Statistics

Accessing Data with Spark: An Introduction to Spark

Course: 1 Hour, 7 Minutes
Course Overview
Introduction to Spark and Hadoop
Resilient Distributed Datasets (RDDs)
RDD Operations
Spark DataFrames
Spark Architecture
Spark Installation
Working with RDDs
Creating DataFrames from RDDs
Contents of a DataFrame
The SQLContext
The map() Function of an RDD
Accessing the Contents of a DataFrame
DataFrames in Spark and Pandas
Exercise: Working with Spark

Getting Started with Hadoop: Fundamentals & MapReduce

Course: 1 Hour, 4 Minutes
Course Overview
An Introduction to Big Data
Building Systems to Scale with Data
A Quick Overview of Hadoop
MapReduce Overview
The Map Phase of a MapReduce
The Shuffle and Reduce Phases
Exercise: Fundamentals of Hadoop and MapReduce

Getting Started with Hadoop: Developing a Basic MapReduce Application

Course: 1 Hour, 14 Minutes
Course Overview
Provisioning a Hadoop Cluster on the Cloud
Browsing the Hadoop Web Applications
Creating a MapReduce project
Coding the Map Phase
Coding the Reduce Phase
Defining the Driver Program
Building the Application
Executing the MapReduce Application
Exercise: Developing a Basic MapReduce Application

Hadoop HDFS: Introduction

Course: 1 Hour, 15 Minutes
Course Overview
Scaling Datasets
Horizontal Scaling for Big Data
Distributed Clusters and Horizontal Scaling
Overview of HDFS
HDFS Architectures
MapReduce for HDFS
YARN for HDFS
The Mechanism of Resource Allocation in Hadoop
Apache Zookeeper for HDFS
The Hadoop Ecosystem
Exercise: An Introduction to HDFS

Hadoop HDFS: Introduction to the Shell

Course: 53 Minutes
Course Overview
Creating a Hadoop Cluster on the Google Cloud
Exploring Hadoop Clusters
The YARN Cluster Manager UI
The HDFS NameNode UI
Browsing the Packaged Hadoop Tools
Configuring HDFS
The HDFS Shells
Exercise: Introduction to the HDFS Shell

Hadoop HDFS: Working with Files

Course: 48 Minutes
Course Overview
Basic Directory Commands in HDFS
Using the copyFromLocal Command in HDFS
Using the put Command in HDFS
Using the copyToLocal Command in HDFS
Retrieving files from HDFS
Append and Delete Operations in HDFS
Exercise: Working with Files on HDFS

Hadoop HDFS: File Permissions

Course: 49 Minutes
Course Overview
The HDFS count and du Commands
Viewing and Setting File Permissions in HDFS
Applying Permissions Recursively in HDFS
An Introduction to Bash Scripting
Scripting HDFS Operations
Exploring the HDFS NameNode UI
Cleanup Operations in HDFS
Exercise: File Permissions on HDFS

Data Silos, Lakes, & Streams: Introduction

Course: 1 Hour, 20 Minutes
Course Overview
Data Silos
Data Lakes
Characteristics of Data Lakes
Data Lake Architecture, Features, and Challenges
Data Warehouses
Data Warehouses vs. Data Lakes
Data Streams
Migrating Data to AWS
Data Lakes on AWS
Working with Data Lakes on AWS
Exercise: Data Silos, Lakes, and Streams

Data Silos, Lakes, and Streams: Data Lakes on AWS

Course: 1 Hour, 10 Minutes
Course Overview
Create a Role for the AWS Glue Service
Upload Data to S3
Explore the Glue Web Console
Manually Create Glue Tables
Query the Data Lake Using Amazon Athena
Configure and Run Glue Crawlers
Access Data in Crawled Tables
Crawl Multiple CSV Files in the Same Folder Path
Merge Data in Multiple Files in the Same Folder Path
Work with Files Having the Exact Same Schema
Exercise: Data Lakes on AWS with S3 and Glue

Data Silos, Lakes, & Streams: Sources, Visualizations, & ETL Operations

Course: 1 Hour, 29 Minutes
Course Overview
Set Up a Redshift Cluster
Create Tables and Load Data From S3
Establish a JDBC Connection to Redshift
Crawl Redshift Using a JDBC Connection
Crawl DynamoDB
Configure QuickSight to Visualize Data
Visualize Data in QuickSight
Configure a Job to Perform Extract, Transform, Load
Execute an ETL Operation in Glue
Perform ETL to Back Up Redshift Data in S3 Buckets
Perform ETL to Back Up DynamoDB Data in S3 Buckets
Exercise: Multiple Sources, Visualizations, and ETL

Data Analysis Application

Course: 1 Hour, 25 Minutes
Course Overview
Install and Configure Anaconda Python
Install R Using Anaconda
Use Jupyter Notebook
Import and Export Data in Python
Import and Export Data in R
Deal with Missing Data in R
Transform Data in R
Work with NumPy
Work with Pandas
Mean, Median, and Mode in R
Analyze Data with Pandas
Plot Data in R
Visualize Data in Python
Exercise: Perform Data Analysis

Data Science Track 2: Data Wrangler

Data Wrangling with Pandas: Working with Series & DataFrames

Course: 1 Hour, 11 Minutes
Course Overview
Installing Pandas
Pandas Series Objects
Operations on Series
Appending and Sorting Series Values
Pandas DataFrames
Indexing Operations with DataFrames
Missing Data
Column Aggregations
Statistical Operations

Data Wrangling with Pandas: Visualizations and Time-Series Data

Course: 1 Hour, 29 Minutes
Course Overview
Pandas and Matplotlib for Visualizations
Pie Charts, Box Plots, and Scatter Plots
Time-Series Data
Deltas and Percentage Change Calculations
Time Deltas and Date Ranges
Mismatched DataFrames and Missing Data
Working with String Data
Advanced Operations on Strings
Applying Functions on Series
Transforming Data With User-Defined Functions
Applying Functions on DataFrames
Exercise: Plot Charts and Transform Column Values

Data Wrangling with Pandas: Advanced Features

Course: 1 Hour, 12 Minutes
Course Overview
Grouping and Aggregations
MultiIndex DataFrames
Grouping and Aggregations with MultiIndex DataFrames
General Aggregation Functions
Filtering
Masking Column Values
Working with Duplicates
Working with Categorical Data
Filtering, Adding, and Removing Categories
Reindexing
Exercise: Filtering, Duplicates and Categorical Data

Data Wrangler 4: Cleaning Data in R

Course: 1 Hour, 3 Minutes
Course Overview
Types of Unclean Data
Data Quality
Downloading JSON Data
Excel Sheets
Reading Dirty CSVs
Querying Relational Databases
Joining Tabular Data
Spreading Data
Summarizing Data
Imputing Data
Extracting Matches
Exercise: Wrangling Data

Data Tools: Technology Landscape & Tools for Data Management

Course: 27 Minutes
Course Overview
Technology Landscape and Tools
Tool Comparison
Machine Learning in Data Analytics
Machine Learning Tools
Machine Learning Implementation
Python and R for Data Management
Cloud and Machine Learning
Exercise: Implement Machine Learning on Scikit-learn

Data Tools: Machine Learning & Deep Learning in the Cloud

Course: 23 Minutes
Course Overview
Microsoft Machine Learning Toolkit
AWS and Machine Learning
Spark Machine Learning Capabilities
Deep Learning Frameworks
Deep Learning Implementation
Data Mining and Analytical Tools
KNIME Capabilities
Exercise: Implement Deep Learning

Trifacta for Data Wrangling: Wrangling Data

Course: 50 Minutes
Course Overview
Standardizing Data
Formatting Dates
Filtering Rows
Replacing Values
Counting Matches
Splitting Columns
Merging Columns
Extracting Data
Conditional Aggregation
Reshaping Data
Joining Data
Exercise: Wrangling Data

MongoDB for Data Wrangling: Querying

Course: 1 Hour, 8 Minutes
Course Overview
Introduction to PyMongo
Document Structure
CRUD Operations
ObjectID and Timestamp
Query Operations
Projection Queries
Comparison Operators
Element Query Operators
The Regex Operator
Using the Size and All Operators
Text Search
Using mongoimport
Using mongoexport
Exercise: Performing a Query

MongoDB for Data Wrangling: Aggregation

Course: 51 Minutes
Course Overview
Aggregation Framework
Using Group
Using Match
Using Project
Using Limit and Sort
Using Unwind
Using Lookup
Using Indexes
Using Geospatial Indexes
Exercise: Performing an Aggregate Query

Getting Started with Hive: Introduction

Course: 56 Minutes
Course Overview
Hive as a Data Warehouse
Overview of Relational Databases
OLTP and OLAP
Hive and the Hadoop Ecosystem
HiveServer and The Metastore
Hive on Cloud Computing Platforms
Data Types in Hive
Data and Tables in Hive
Exercise: Introduction to Hive

Getting Started with Hive: Loading and Querying Data

Course: 1 Hour, 20 Minutes
Course Overview
Setting up a Hadoop Cluster on the Google Cloud
Creating a Hive Table
Running Simple Queries in Hive
Executing Hive Queries from the Shell
Joining Tables in Hive
Exploring the Hive Warehouse
External Tables in Hive
Modifying Tables in Hive
Temporary Tables in Hive
Loading Data into Tables in Hive
Populating Multiple Tables in Hive
Exercise: Loading and Querying Data in Hive

Getting Started with Hive: Viewing and Querying Complex Data

Course: 1 Hour, 14 Minutes
Course Overview
The Array Data Type in Hive
The Map Data Type in Hive
The Struct Type in Hive
The explode and posexplode Functions in Hive
Lateral Views in Hive
Multiple Lateral Views in Hive
Set Operations in Hive
The IN and EXISTS clauses in Hive
Creating and Populating Tables in Hive
Views in Hive
Exercise: Viewing and Querying Complex Data

Getting Started with Hive: Optimizing Query Executions

Course: 43 Minutes
Course Overview
Hive Queries as MapReduce Jobs
Techniques to Improve Query Performance in Hive
Partitioning Tables in Hive
Bucketing Tables in Hive
Structuring Join Queries in Hive
Exercise: Optimizing Query Execution in Hive

Getting Started with Hive: Optimizing Query Executions with Partitioning

Course: 1 Hour, 1 Minute
Course Overview
Setting up a Hadoop Cluster on the Google Cloud
Creating a Partitioned Table in Hive
Working with Partitions in Hive
Populating Partitions in Hive
Partitioning External Tables in Hive
Modifying Partitions in Hive
Dynamic Partitions in Hive
Using Multiple Columns for Partitioning in Hive
Exercise: Optimize Executions with Partitioning

Getting Started with Hive: Bucketing & Window Functions

Course: 1 Hour, 4 Minutes
Course Overview
Apply Bucketing for a Table in Hive
Using Bucketing and Partitioning Together in Hive
Sorting a Bucket's Contents in Hive
Sampling a Table in Hive
Joining Multiple Tables in Hive
Introducing Window Functions in Hive
Window Functions with Partitions in Hive
Exercise: Bucketing and Window Functions in Hive

Getting Started with Hadoop: Filtering Data Using MapReduce

Course: 59 Minutes
Course Overview
Counting the Data Points in Each Category
The Reducer and Driver Programs
Building and Executing the Application
A Simple Filter Using MapReduce
Executing and Examining the Output
Extracting the Unique Values in a Column
Viewing the Distinct Values Extracted
Exercise: Filtering Data Using MapReduce

Getting Started with Hadoop: MapReduce Applications With Combiners

Course: 1 Hour, 24 Minutes
Course Overview
Combiners in MapReduce
Revisiting MapReduce
Working with Combiners
Using Combiners for Calculating Averages
Creating a Project to Calculate Averages
Coding the Map and Reduce Phases
Configure the Application in the Driver
Executing the Application and Examining the Output
Adding a Combiner to a MapReduce Application
Conveying a Pair of Numbers from the Mapper
Running the Fixed Application
Exercise: Optimizing MapReduce With Combiners

Getting Started with Hadoop: Advanced Operations Using MapReduce

Course: 49 Minutes
Course Overview
Defining a User-Defined Type for a PriorityQueue
Implementing a PriorityQueue in a Mapper
Using a PriorityQueue in a Reducer
Running and Verifying the Results
Building an Inverted Index - Map Phase
Building an Inverted Index - Reduce Phase
Executing the Application and Viewing the Index
Exercise: Advanced Operations Using MapReduce

Accessing Data with Spark: Data Analysis Using the Spark DataFrame API

Course: 1 Hour, 12 Minutes
Course Overview
Performance Improvements in Spark
Broadcast Variables and Accumulators
Loading Data into a DataFrame
Sampling the Contents of a DataFrame
Grouping and Aggregations
Visualizing Data in a DataFrame
Trimming and Cleaning Data
User-Defined Functions and DataFrames
Combining Filters, Aggregations, and Sorting
Using Broadcast Variables
Using Accumulators
Exporting DataFrame Contents
Custom Accumulators
Join Operations
Exercise: Data Analysis Using the DataFrame API

Accessing Data with Spark: Data Analysis using Spark SQL

Course: 55 Minutes
Course Overview
The Spark Catalyst Optimizer
Introduction to Spark SQL
Preparing Data for Analysis
Running SQL Queries
Inferred and Explicit Schemas
Windowing in Spark
Applying Window Functions
Exercise: Data Analysis Using Spark SQL

Data Lake: Framework & Design Implementation

Course: 34 Minutes
Course Overview
Data Lakes and Data Warehouses
Data Lake Selection Criteria
Data Lake and Data Democratization
Data Lake Design Principles
AWS Data Lake Architecture
Implement AWS Data Store
Data Lake For On-Premise and Multi-Cloud
Data Processing Frameworks for Data Lake
Exercise: Implement AWS Data Store

Data Lake: Architectures & Data Management Principles

Course: 35 Minutes
Course Overview
Real-Time Big Data Architectures
Data Lake Reference Architecture
Data Ingestion and File Formats
Ingestion Using Sqoop
Data Processing Strategies
Deriving Value from Data Lakes
Data Life Cycle
S3 and Glacier
Exercise: Ingest Data and Implement Archival Policy

Data Architecture - Deep Dive: Design & Implementation

Course: 36 Minutes
Course Overview
Data Complexity Management Strategies
Data Modeling Process
Distributed Data Management
Partitioning Methods and Criteria
MongoDB Partitioning
Hybrid Data Architectures
Implement Directed Acyclic Graph
CAP Theorem
Batch vs. Streaming
Read and Write Concerns
Exercise: Implement Serverless Architecture

Data Architecture - Deep Dive: Microservices & Serverless Computing

Course: 26 Minutes
Course Overview
Microservices and Data
Serverless and Lambda Architecture
Lambda Implementation
Cluster Benefits
Data Architecture Types
Data Discovery Process
Data Risk Types
Data POC
Exercise: Implement Lambda Architecture

Data Science Track 3: Data Ops

Deploying Data Tools: Data Science Tools

Course: 48 Minutes
Course Overview
Data Science Platform
Challenges of Deploying Data Science Tools
Considerations for Data Science Tools
Data Science Workflow
Data Science Analytic Tools
Data Science Visualization Tools
Data Science Database Tools
Benefits of Deploying Cloud-Based Tools
Challenges of Deploying Cloud-Based Tools
What is DevOps
DevOps for Data Science
Exercise: Identifying Uses of Data Science Tools

Delivering Dashboards: Management Patterns

Course: 34 Minutes
Course Overview
Analytical Visualization
Dashboard Types
Data Management
Dashboard Components
Dashboard Best Practices
Dashboard Using ELK
Dashboard Using Power BI
Chart Selection Criteria
Leaderboards and Scorecards
Scorecard Types
Exercise: Create Dashboards with Power BI and ELK

Delivering Dashboards: Exploration & Analytics

Course: 31 Minutes
Course Overview
Data Exploration Using Charts
Analytical Visualization Tools
Bar and Line Charts
Dashboarding with Kibana
Dashboard Sharing with Kibana
Dashboarding with Tableau
Dashboarding with Qlikview
Data Ingest and Dashboards
Dashboard Patterns
Monitoring Dashboards
Exercise: Create Dashboards Using Kibana and Tableau

Cloud Data Architecture: DevOps & Containerization

Course: 45 Minutes
Course Overview
Containerization on the Cloud
Benefits of Containers
Serverless Computing
DevOps in the Cloud
AWS OpsWorks
Storage Classification
Cloud and Machine Learning
Cloud and BI Analytics
Exercise: Containerization and Serverless Computing

Compliance Issues and Strategies: Data Compliance

Course: 44 Minutes
Course Overview
Data Compliance Issues
Data Regulations
The Importance of Global Standards
Risk and Company Standards
Myths and Facts of Data Compliance
Compliance Training for Users
Compliance Training for Management
The Benefits of a Data Compliance Program
Elements of a Good Compliance Strategy
Building a Compliance Strategy
Reporting and Response Procedures
Exercise: Explain the Importance of Data Compliance

Implementing Governance Strategies

Course: 46 Minutes
Course Overview
Governance and its Relationship with Big Data
Why Big Data Requires Governance
Requirements for Big Data Governance
Why is Big Data Different?
Identifying Data
Identifying Stakeholders
Cloud Technologies and Data Governance
Designing a Data Governance Process
Managing a Data Governance Strategy
Monitoring a Data Governance Strategy
Maintaining a Data Governance Strategy
Exercise: Defining Data Governance Strategies

Data Access & Governance Policies: Data Access Oversight and IAM

Course: 59 Minutes
Course Overview
Data Access Governance
Risk and Data Safety Compliance
Data Access Patterns
Data Breach Prevention
Least Privilege
Assign and View Effective File System Permissions
Identity and Access Management
Create an AWS IAM User and Group
Assign AWS IAM Group Permissions
Vulnerability Assessments
Implement Effective Security Controls
Exercise: Implement Data Access Governance Solutions

Data Access & Governance Policies: Data Classification, Encryption, and Monitoring

Course: 1 Hour, 19 Minutes
Course Overview
Data Classification
Classify Data Using Microsoft FSRM
Data Encryption
Encrypt Data at Rest
Encrypt Data in Motion
Implement Security Compliance Checking
Examine Data Access Trends
Data Access Monitoring Solutions
Logging, Auditing, and Data Analytics
Configure a Custom Filtered Log View
Enable Windows Data Access Auditing
Exercise: Implement Data Confidentiality

Streaming Data Architectures: An Introduction to Streaming Data

Course: 51 Minutes
Course Overview
Introduction to Streaming Data
The Stream Processing Model
The Message Transport
Stream Processing with RDDs
Structured Streaming for Continuous Applications
Streaming vs Structured Streaming
Triggers and Output Modes
Exercise: Working with Streaming Data

Streaming Data Architectures: Processing Streaming Data

Course: 53 Minutes
Course Overview
PySpark Setup
Setting Up a Socket Stream with Netcat
The Update Output Mode
Using a File Input Stream
The Append Output Mode
The Complete Output Mode
Aggregations on Streaming Data
SQL Operations on Streaming Data
User-Defined Functions (UDFs)
Exercise: Processing Streaming Data

Scalable Data Architectures: Introduction

Course: 53 Minutes
Course Overview
Scalable Architectures with Distributed Computing
Introducing Data Warehouses
Contrasting Warehouses with Relational Databases
Data Warehouses for Analytical Processing
Data Warehouse Architectural Components
Amazon Redshift - A Data Warehouse on the Cloud
Exercise: Scalable Data Architectures

Scalable Data Architectures: Introduction to Amazon Redshift

Course: 55 Minutes
Course Overview
Provisioning a Redshift Cluster Using Quick Launch
Creating a Redshift Cluster With Additional Detail
Exploring the Redshift Configs and Metrics
Attaching an IAM Role to a Redshift Cluster
Creating an AWS User to Work With Redshift
Installing and Configuring the AWS CLI
Running Queries from the Redshift Query Editor
Exercise: An Introduction to Amazon Redshift

Scalable Data Architectures: Working with Amazon Redshift & QuickSight

Course: 1 Hour, 18 Minutes
Course Overview
Loading Data from Amazon S3 to a Redshift Cluster
Running Queries and Evaluating Their Execution
Querying a Redshift Cluster Using a SQL client
Working with Automated Snapshots
Restoring Tables from a Snapshot
Horizontal Scaling of a Redshift Cluster
Vertical and Horizontal Scaling of a Cluster
Configuring Access from QuickSight to Redshift
Loading a Dataset to QuickSight
Creating Visualizations with QuickSight
Exercise: Working with Redshift and QuickSight

Building Data Pipelines

Course: 1 Hour, 10 Minutes
Course Overview
Data Pipelines Overview
Traditional ETL Pipeline with Batch Processing
Data Pipeline Tools
Setup and Install Airflow
Apache Airflow
Airflow Workflows
Airflow Tasks
Airflow Dependencies
ETL Pipeline with Airflow
Automated Pipeline without ETL
Airflow Command Line Testing
Exercise: Using Apache Airflow

Data Pipeline: Process Implementation Using Tableau & AWS

Course: 39 Minutes
Course Overview
Data Pipeline
Data Pipeline Processes
Data Pipeline Stages
Data Pipeline Technologies
Data Source Types
Scheduled Data Pipeline
Tableau Server and Utilities
Data Pipeline Using Tableau
Data Pipeline on AWS
Exercise: Build Data Pipelines with Tableau

Data Pipeline: Using Frameworks for Advanced Data Management

Course: 33 Minutes
Course Overview
Celery and Luigi
Data Pipeline with Python Luigi
Working with Dask Library
Dask Arrays
Data Exploration and Visualization Frameworks
Spark and Tableau
Streaming Data Visualization with Python
Data Pipeline Open Source Tools
Exercise: Implement Data Pipelines with Luigi

Data Sources: Integration

Course: 40 Minutes
Course Overview
Elements of IoT Solutions
Service Categories in IoT
IoT Capabilities and Maturity Model
IoT Design Principles
IoT Cloud Architectures
MQTT and XMPP
IoT Controllers
IoT Data Management
Securing IoT
Exercise: Generating Data Streams

Data Sources: Implementing Edge on the Cloud

Course: 31 Minutes
Course Overview
AWS IoT Greengrass
GCP IoT Edge
AWS IoT over WebSockets
IoT Device Simulator
Generating Streams of Data Using MQTT
Exercise: Working with IoT Device Simulators

Securing Big Data Streams

Course: 1 Hour, 3 Minutes
Course Overview
Big Data Security Concerns
Streaming Data Security Concerns
NoSQL Database Security Concerns
Distributed Processing Security Risks
Data Mining and Analytics Privacy Flaws
End-Point Device Tampering Risks
Secure Big Data
Secure Data Streams
Secure Data In Motion
End-Point Input Validation and Filtering
Secure Data at Rest with Symmetric Ciphers
Exercise: Securing Big Data Streams

Harnessing Data Volume & Velocity: Big Data to Smart Data

Course: 39 Minutes
Course Overview
Comparing Big Data and Smart Data
Smart Data and Edge Technologies
Big Data to Smart Data Formation
Smart Data and Smart Processes
Smart Data Use Cases
Smart Data Life Cycle
Big Data to Smart Data Using k-NN
Smart Data Frameworks
Smart Data to Business
Clustering Smart Data
Smart Data Integration
Exercise: Transform Big Data to Smart Data

Data Rollbacks: Transaction Rollbacks & Their Impact

Course: 36 Minutes
Course Overview
Rollback Process
State of Transactions
Transaction Types
SQL Transaction Management
Transaction Log Operations
Deadlock Management
SQL Server Rollback Mechanism
SQL Server Rollback Mechanism Implementation
Exercise: Implement Transactions with SQL Server

Data Rollbacks: Transaction Management & Rollbacks in NoSQL

Course: 29 Minutes
Course Overview
NoSQL and SQL Transaction Management
MongoDB Transactions
Manage Multi-Document Transactions in MongoDB
Change Data Capture
Change Stream in MongoDB
MongoDB Change Stream Implementation
Exercise: MongoDB Transactions and Change Streams

Data Science Track 4: Data Scientist

Balancing the Four Vs of Data: The Four Vs of Data

Course: 40 Minutes
Course Overview
Overview of the Four Vs
The Importance of Volume
The Importance of Variety
The Importance of Velocity
The Importance of Veracity
The Relationship Between the Four Vs
Variety and Data Structure
Validity and Volatility
Finding Balance in the Four Vs
Use Cases
Extracting Value from the Four Vs
Exercise: Describe the Four Vs of Big Data

Data Driven Organizations

Course: 1 Hour, 15 Minutes
Course Overview
Data Driven Organizations
Decision Making
Analytic Maturity
Analytic Roles
Data Source Priority
Facets of Data Quality
Power BI Data Visualization
Missing Data
Duplicate Data
Truncated Data
Data Provenance

Raw Data to Insights: Data Ingestion & Statistical Analysis

Course: 54 Minutes
Course Overview
Statistical Analysis
Data Correction
Outlier Detection
Data Architecture Pattern
Data Ingestion Tools
Kafka and Apache NiFi
Apache Sqoop Ingest
Ingest Using WaveFront

Raw Data to Insights: Data Management & Decision Making

Course: 57 Minutes
Course Overview
Data-driven Decision Making Framework
Loading Data into R
Preparing Data
Data Correction Approach
Data Correction Using Simple Transformation
Data Correction Using Deductive Correction
Distributed Data Management
Data Analytics
Data Analytics Using R
Predictive Modeling

Tableau Desktop: Real Time Dashboards

Course: 1 Hour, 8 Minutes
Course Overview
Introducing Real Time Dashboards
Creating Real Time Dashboards with Tableau
Build a Tableau Dashboard
Real Time Dashboard Updates in Tableau
Organizing Your Tableau Dashboard
Formatting Your Tableau Dashboard
Interactive Tableau Dashboard
Tableau Dashboard Starters
Tableau Dashboard Extensions
Tableau Dashboards and Story Points
Sharing your Tableau Dashboard

Storytelling with Data: Introduction

Course: 47 Minutes
Course Overview
Storytelling Process
Interpreting Context
Analysis Types
Who, What, and How of Storytelling
Visualization for Storytelling
Graphical Tools for Data Elaboration
Storytelling Scenarios
Storyboarding

Storytelling with Data: Tableau & PowerBI

Course: 57 Minutes
Course Overview
Visual Selection
Slopegraphs
Bar Charts and Types of Bar Charts
Clutter and Clutter Elimination
Gestalt Principle
Story Design Best Practices
Tools for Storytelling
Decluttering
Crafting Visual Data
Visual Design Concerns
Storytelling with Power BI
Model Visual and Tableau

Python for Data Science: Basic Data Visualization Using Seaborn

Course: 1 Hour, 7 Minutes
Course Overview
Introduction to Seaborn
Install Seaborn
Simple Univariate Distributions
Configure Univariate Distribution Plots
Simple Bivariate Distributions
Explore Different Types of Bivariate Distributions
Analyze Multiple Variable Pairs
Regression Plots
Themes and Styles in Seaborn

Python for Data Science: Advanced Data Visualization Using Seaborn

Course: 1 Hour, 4 Minutes
Course Overview
Searching for Patterns in a Dataset
Configuring Plot Aesthetics
Normal Distribution and Outliers
Distributions Within Categories - Part 1
Distributions Within Categories - Part 2
Analyzing Categories with Facet Grids - Part 1
Analyzing Categories with Facet Grids - Part 2
Introducing Color Palettes
Using Color Palettes

Data Science Statistics: Using Python to Compute & Visualize Statistics

Course: 1 Hour, 16 Minutes
Course Overview
An Introduction to Matplotlib
Analyzing Data Using NumPy and Pandas
Visualizing Univariate and Bivariate Distributions
Summary Statistics Using Native Python Functions
Summary Statistics Using NumPy
Summary Statistics Using the SciPy Library
Correlation and Covariance
Z-score

R for Data Science: Data Visualization

Course: 33 Minutes
Course Overview

Advanced Visualizations & Dashboards: Visualization Using Python

Course: 38 Minutes
Course Overview
Relevance of Data Visualization for Business
Libraries for Data Visualization in Python
Python Data Visualization Environment Configuration
Matplotlib Libraries for Visualization
Bar Chart Using ggplot
Bokeh and Pygal
Select Visualization Libraries
Interactive Graphs and Image Files
Plot Graphs
Multiple Lines in Graphs

Advanced Visualizations & Dashboards: Visualization Using R

Course: 35 Minutes
Course Overview
Chart Types
Stacked Bar Plot
Animate Plots with Matplotlib
Plotting in Jupyter Notebook
Graphics in R
Heat Map and Scatter Plot in R
Correlogram and Area Chart in R
ggplot2 Capabilities
Customize ggplot2 Graphs

Powering Recommendation Engines: Recommendation Engines

Course: 1 Hour, 5 Minutes
Course Overview
Describing Recommendation Engines
Comparing the Types of Recommendation Engines
Collecting and Manipulating Data
Manipulating Data in R
Describing Similarity and Neighborhoods
Creating a Recommendation Engine
Recommending Another Item
Finding Items to Recommend
Recommending Items Based on Other Items
Evaluating a Recommendation System
Validating a Recommendation System

Data Insights, Anomalies, & Verification: Handling Anomalies

Course: 46 Minutes
Course Overview
Data and Anomaly Sources
Decomposition and Forecasting
Examine Data Using Randomization Tests
Anomaly Detection
Anomaly Detection Techniques
Anomaly Detection with scikit-learn
Anomaly Detection Tools
Anomaly Detection Rules

Data Insights, Anomalies, & Verification: Machine Learning & Visualization Tools

Course: 51 Minutes
Course Overview
Machine Learning Anomaly Detection Techniques
Comparing Anomaly Detection Algorithms
Anomaly Detection Using R
Online Anomaly Detection Components
Online Anomaly Detection Approaches
Anomaly Detection Use Cases
Anomaly Detection with Visualization Tools
Anomaly Detection with Mathematical Approaches
Cluster-Based Anomaly Detection

Data Science Statistics: Applied Inferential Statistics

Course: 1 Hour, 19 Minutes
Course Overview
The One-Sample T-test
Independent and Paired T-tests
Testing Hypotheses with T-tests
Loading and Analyzing a Skewed Dataset
Measuring Skewness and Kurtosis
Preparing a Dataset for Regression
Simple Linear Regression
Multiple Linear Regression

Data Research Techniques

Course: 33 Minutes
Course Overview
Data Research Fundamentals
Data Research Steps
Values, Variables, and Observations
JMP Scale of Measurement
Non-experimental and Experimental Research
Descriptive and Inferential Statistical Analysis
Inferential Tests
Case Study of Clinical Data Research
Data Research in Sales Management

Data Research Exploration Techniques

Course: 50 Minutes
Course Overview
Fundamentals of Exploratory Data Analysis
Data Exploration Types
Working with R
Data Exploration in R
Data Exploration Using Plots
Python Packages for Data Exploration
Data Exploration Using Python
Data Research Using Linear Algebra
Linear Algebra for Data Research

Data Research Statistical Approaches

Course: 43 Minutes
Course Overview
Role of Statistics in Data Research
Discrete vs. Continuous Distribution
PDF and CDF
Binomial Distribution
Interval Estimation
Point and Interval Estimation
Data Visualization Techniques
Data Visualization Using R
Data Integration Techniques
Creating Plots
Missing Values and Outliers

Machine & Deep Learning Algorithms: Introduction

Course: 46 Minutes
Course Overview
Machine Learning Algorithms
How Machine Learning Works
Introduction to Pandas ML
Support Vector Machines
Overfitting

Machine & Deep Learning Algorithms: Regression & Clustering

Course: 49 Minutes
Course Overview
The Confusion Matrix
An Introduction to Regression
Applications of Regression
Supervised and Unsupervised Learning
Clustering
Principal Component Analysis

Machine & Deep Learning Algorithms: Data Preparation in Pandas ML

Course: 1 Hour, 4 Minutes
Course Overview
Data Preparation in scikit-learn
Training and Evaluating Models in scikit-learn
Introducing the Pandas ML ModelFrame
Training and Evaluating Models in Pandas ML
Preparing Data for Regression
Evaluating Regression Models
Preparing Data for Clustering
The K-Means Clustering Algorithm

Machine & Deep Learning Algorithms: Imbalanced Datasets Using Pandas ML

Course: 1 Hour, 24 Minutes
Course Overview
Analyzing an Imbalanced Dataset
The RandomOverSampler
The SMOTE Oversampler
Undersampling Using imbalanced-learn
Ensemble Classifiers for Imbalanced Data
Combination Samplers
Finding Correlations in a Dataset
Building a Multi-Label Classification Model
Dimensionality Reduction with PCA
Imbalanced Learn and PCA

Creating Data APIs Using Node.js

Course: 1 Hour, 31 Minutes
Course Overview
API Prerequisites
Building a RESTful API Using Node.js and Express.js
RESTful API with OAuth
HTTP Server with Hapi.js
API Modules
Returning Data with JSON
Nodemon for Development Workflow
API Requests
Postman for API
Deploying APIs
Social Media APIs
Exercise: Building RESTful APIs

Online Mentor

You can reach your Mentor by entering chats or submitting an email.

Final Exam assessment

Estimated duration: 90 minutes

Practice Labs: Data Visualization with Python (estimated duration: 8 hours)

Perform data visualization tasks with Python, such as creating scatter plots, plotting linear regression, using logistic regression, and creating a decision tree. Then test your skills by answering assessment questions after creating time-series graphs, resampling observations, creating histograms, and using a grid pair.

Duration: 120 hours
Language: English
Certificate of participation: Yes
Online access: 365 days
Progress tracking: Yes
Award-winning e-learning: Yes
Mobile-friendly: Yes
