OEM SRE DRE Toolbox Training
€239,58 incl. tax, €198,00 excl. tax
In stock

  • Officially recognized test center
    Take exams online or on site
  • Award-winning e-learning
    Including practice exams and 24/7 support
  • ISO 9001 & 27001 working methods
    2,500+ organizations have gone before you
  • Customized training & free baseline assessment
    Always start at the right level

Product description

Site Reliability Engineering/Data Reliability Engineering (SRE/DRE) Toolbox Training

Learn to build and manage reliable IT and data systems with SRE and DRE. This ICT training course covers monitoring, incident management, CI/CD, cloud, multicloud, Prometheus, Grafana, ELK, Puppet and data reliability.

This online ICT training course is designed for professionals who want to build job-ready knowledge and practical skills in a structured way. The course combines clear theory, hands-on learning, assessments and a complete learning path that supports both individual learners and organizations investing in future-proof IT capabilities.

This LearningKit, with more than 29 hours of learning, is divided into four tracks.

Why follow this ICT training course?

Modern IT teams need people who understand automation, AI, cloud, DevOps, reliability and governance. This OEM training helps learners move beyond theory and apply the concepts in realistic professional environments. The result is a stronger foundation for improving IT operations, building reliable systems and supporting innovation.

What you will learn

  • Apply SRE principles, observability, monitoring and alerting
  • Organize incident management, postmortems and continuous improvement
  • Understand Prometheus, Grafana, ELK Stack, Spinnaker, Puppet, Consul and ZooKeeper
  • Use cloud, multicloud and deployment strategies for reliable systems
  • Apply Data Reliability Engineering for reliable and scalable data ecosystems

Training structure

Track 1: Site Reliability Engineering Foundations

Build the foundation for reliable services with SRE principles, observability, monitoring, Prometheus, Grafana, Alertmanager, logging and Google Cloud Operations.

  • Introduction to SRE and Essential Tools
  • Implementing SRE Best Practices with Tools
  • Site Reliability Engineering Network Optimization
  • Site Reliability Engineering Observability
  • Comprehensive Monitoring with Prometheus
  • Comprehensive Monitoring with Grafana
  • Alerting and Logging with Alertmanager
  • Alerting and Logging with Google Cloud Operations

Introduction to SRE and Essential Tools

Course: 1 Hour, 35 Minutes

  • Course Overview
  • Site Reliability Engineers
  • The Evolution of Site Reliability Engineering (SRE)
  • Site Reliability Engineering Role
  • Site Reliability Engineering Principles
  • Key Site Reliability Engineering Metrics
  • Error Budgeting
  • Essential Site Reliability Engineering Tools
  • SRE vs. IT Tools
  • Site Reliability Engineering Lifecycle
  • Incident Response and Postmortem Analysis
  • Automation and System Reliability
  • Cultural Impacts of SRE
  • Using Monitoring Tools
  • Using Dashboards in Grafana
  • Course Summary

Implementing SRE Best Practices with Tools

Course: 1 Hour, 8 Minutes

  • Course Overview
  • Monitoring and Alerting Best Practices
  • Define Service-level Objectives (SLOs) and Service-level indicators (SLIs)
  • Incident Management
  • Automation Tools
  • Integration with Workflows
  • Capacity Planning and Resource Allocation
  • Service-level Indicators
  • Continuous Improvement
  • Implementing an Incident Response Simulation
  • Automating Routine Maintenance Tasks
  • Course Summary

Site Reliability Engineering Network Optimization

Course: 1 Hour, 26 Minutes

  • Course Overview
  • Network Optimization for Site Reliability Engineering (SRE)
  • Common Network Bottlenecks
  • Network Performance and Latency
  • Network Performance Optimization
  • Network Design and Service Reliability
  • Redundant Network Pathway Strategies
  • Network Troubleshooting and Diagnosis
  • Network Monitoring and Management Tools
  • Network Load Balancing and Traffic Management
  • Network Communication Security
  • Conducting Wireshark Network Performance Analysis
  • Implementing Linux IPVS Load Balancing
  • Course Summary

Site Reliability Engineering Observability

Course: 1 Hour, 31 Minutes

  • Course Overview
  • Site Reliability Engineering (SRE) Observability
  • SRE Observability Pillars
  • SRE Observability Tools
  • SRE Distributed System Observability
  • Log Management and Analysis Strategies
  • Network Metric Collection and Analysis
  • Network Trace Analysis
  • Network Observability Use Cases
  • Network Observability Alerts
  • Network Observability Root Cause Analysis
  • Setting up .NET Core Logging
  • Configuring Datadog Monitoring and Alerting
  • Working with Microsoft Network Analyzer
  • Course Summary

Comprehensive Monitoring with Prometheus

Course: 1 Hour, 22 Minutes

  • Course Overview
  • Prometheus Monitoring
  • Prometheus Characteristics and Components
  • The Prometheus Data Model
  • Prometheus Metric Types
  • Prometheus Jobs and Instances
  • Structure of Labels in Prometheus
  • Guidelines for Prometheus Consoles and Dashboards
  • Long-Term Storage in Prometheus
  • Storage and Performance Optimization
  • Scaling Prometheus Monitoring
  • Prometheus at Scale
  • Installing Prometheus
  • Configuring Prometheus
  • Course Summary

Comprehensive Monitoring with Grafana

Course: 1 Hour, 10 Minutes

  • Course Overview
  • The Grafana Platform
  • Grafana Dashboard Design Best Practices
  • Grafana Dashboard Security
  • Installing Grafana
  • Connecting Prometheus in Grafana
  • Using Grafana's Query Editor
  • Designing Dashboards in Grafana
  • Leveraging the Grafana API
  • Using Annotations in Grafana
  • Creating Alerts in Grafana
  • Course Summary

Alerting and Logging with Alertmanager

Course: 1 Hour, 10 Minutes

  • Course Overview
  • Prometheus Alerting
  • Prometheus Alertmanager Best Practices
  • Prometheus Alertmanager Alert Grouping
  • Prometheus Alertmanager Inhibition
  • Prometheus Alertmanager Silences
  • Prometheus Alertmanager High Availability
  • Notification Templates
  • Installing Alertmanager
  • Configuring Alertmanager
  • Setting Up and Testing Alerts in Alertmanager
  • Leveraging Alertmanager’s Management API
  • Course Summary

Alerting and Logging with Google Cloud Operations

Course: 1 Hour, 14 Minutes

  • Course Overview
  • Google Cloud Operations
  • Google Cloud Observability
  • Logging Use Cases
  • Cloud Monitoring Metrics, Time Series Data, and Resources
  • Cloud Audit Logs
  • Hybrid and Multicloud Deployments
  • Google Cloud Operations
  • Querying Logs in Google Cloud
  • Creating Metric-Threshold Alert Policies
  • Leveraging the Cloud Monitoring Dashboard API
  • Course Summary

Assessment: Final Exam: Site Reliability Engineering Foundations

Track 2: Site Reliability Engineering Management

Develop advanced operational leadership skills for incident management, postmortems, Elastic Stack, capacity planning, load testing and performance optimization.

  • SRE Incident Management: Deep Dives, Postmortems, & Continuous Improvement
  • SRE Incident Management: Fundamentals & Best Practices
  • Elastic (ELK) Stack for Log Management
  • Advanced Techniques in Elastic Stack
  • SRE: Capacity Planning & Load Testing Essentials
  • SRE: Advanced Capacity Planning & Performance Optimization

SRE Incident Management: Fundamentals & Best Practices

Course: 1 Hour, 20 Minutes

  • Course Overview
  • Fundamentals of SRE Incident Management
  • The SRE Incident Response Team
  • SRE Incident Management Procedures
  • SRE Incident Management Communication Planning
  • SRE Incident Management Documentation
  • SRE Incident Management Tracking
  • SRE Best Practices for Incident Triage
  • SRE Incident Response Team Wellness
  • SRE Incident Postmortem
  • SRE Incident Continuous Improvement
  • Implementing an Incident Response Simulation
  • Course Summary

SRE Incident Management: Deep Dives, Postmortems, & Continuous Improvement

Course: 1 Hour, 42 Minutes

  • Course Overview
  • SRE Incident Analysis
  • SRE Incident Response and Postmortem Analysis
  • Postmortem Meeting Facilitation
  • SRE Incident Continuous Improvement
  • SRE Response Effectiveness Measurements
  • SRE Psychological Safety and Communication
  • SRE Incident Response Awareness Training
  • SRE Tool and Automation Enhancement
  • SRE Incident Response Organizational Awareness
  • SRE Incident Responsiveness Improvements
  • Course Summary

Elastic (ELK) Stack for Log Management

Course: 1 Hour, 26 Minutes

  • Course Overview
  • ELK Stack Log Management
  • Installing and Configuring Elasticsearch
  • Installing and Configuring the Kibana Visualization Tool
  • Installing Logstash and Configuring the Data Processing Pipeline
  • Configuring Logstash to Process Log Events
  • Exploring Kibana Dashboards
  • Logstash Pipelines
  • Using Elasticsearch Query Domain Specific Language (DSL)
  • Advanced Elasticsearch Query Development
  • Creating Dynamic Dashboards in Kibana
  • Elasticsearch Integration
  • Course Summary

Advanced Techniques in Elastic Stack

Course: 1 Hour, 20 Minutes

  • Course Overview
  • Elastic Stack Security
  • Securing the Elastic Stack with RBAC
  • Advanced Logstash Filters
  • Implementing Rules and Alerts with Elastic Stack
  • Elasticsearch Optimization
  • Elasticsearch ML Features
  • Configuring Elastic Machine Learning
  • Creating Elasticsearch Anomaly Detection
  • High Availability Elasticsearch Clusters
  • Configuring Scalable and Highly Available Clusters
  • Elastic Stack Audit
  • Course Summary

SRE: Capacity Planning & Load Testing Essentials

Course: 1 Hour, 2 Minutes

  • Course Overview
  • Site Reliability Engineering Capacity Planning
  • Site Reliability Engineering Load Testing
  • Performance and Capacity Metrics
  • Load Testing
  • Load Testing Tools
  • Load Testing Analysis
  • Load Test Scenarios
  • Load Testing Strategy
  • Scaling Strategies
  • Course Summary

SRE: Advanced Capacity Planning & Performance Optimization

Course: 1 Hour

  • Course Overview
  • Designing Complex Load Test Scenarios
  • Advanced Performance Analysis Techniques
  • Business Continuity and Strategic Planning
  • Predictive Analytics and Modeling Techniques
  • Continuous Capacity Automation Frameworks
  • Capacity Planning Case Studies
  • Infrastructure Cost Optimization
  • Proactive Monitoring Strategies
  • Performance Testing Reports
  • Course Summary

Assessment: Final Exam: Site Reliability Engineering Management

Track 3: Site Reliability Engineering Tools

Gain tool-centric expertise with Spinnaker, Puppet, ZooKeeper, Consul, cloud services, deployment fundamentals and advanced multicloud strategies.

  • Spinnaker and Deployment Fundamentals
  • Advanced Spinnaker Deployment Strategies and Security
  • Puppet Essentials and Configuration Management Basics
  • Advanced Puppet Configuration and Automation Techniques
  • Site Reliability Engineering: Apache ZooKeeper for Distributed Systems
  • Site Reliability Engineering: Consul for Service Discovery and Configuration
  • Site Reliability Engineering Toolbox: Cloud Services & Deployment Fundamentals
  • Site Reliability Engineering Toolbox: Advanced Multicloud Strategies and Best Practices

Spinnaker and Deployment Fundamentals

Course: 1 Hour, 34 Minutes

  • Course Overview
  • Spinnaker Fundamentals
  • Installing and Configuring Spinnaker
  • Creating Spinnaker Deployment Pipelines
  • Integrating Spinnaker
  • Spinnaker Cloud Deployments
  • Spinnaker Triggers and Conditional Deployments
  • Monitoring in Spinnaker
  • Customizing Spinnaker Pipelines
  • Course Summary

Advanced Spinnaker Deployment Strategies and Security

Course: 1 Hour, 42 Minutes

  • Course Overview
  • Rollbacks and Staged Rollouts in Spinnaker
  • Access Control in Spinnaker
  • Pipeline Stages in Spinnaker
  • Designing Custom Spinnaker Pipeline Stages
  • Utilizing Spinnaker Webhooks
  • Spinnaker Plugins
  • Advanced Deployment Strategies
  • Security Scans and Compliance Checks in Spinnaker
  • Optimizing Spinnaker Pipeline Templates
  • Spinnaker Disaster Recovery Strategies
  • Course Summary

Puppet Essentials and Configuration Management Basics

Course: 57 Minutes

  • Course Overview
  • Puppet Fundamentals
  • Accessing Hardened Puppet Core Repositories
  • Generating Puppet Infrastructure
  • Configuring a Puppet Master and Agent Setup
  • Writing Puppet Manifest Configurations
  • Puppet Components
  • Puppet Configuration Management
  • Implementing Puppet Version Control
  • Classify Puppet Nodes
  • Puppet Configuration Troubleshooting
  • Course Summary

Advanced Puppet Configuration and Automation Techniques

Course: 52 Minutes

  • Course Overview
  • Puppet Complex Configurations
  • Puppet Custom Resource Types
  • Testing and Deploying Puppet Code
  • Hiera for Hierarchical Data Storage
  • Configure Hiera
  • Puppet Code
  • Dependency Management
  • PuppetDB
  • Extend Puppet with Custom Functions and Facts
  • Puppet Modules
  • Course Summary

Site Reliability Engineering: Apache ZooKeeper for Distributed Systems

Course: 1 Hour, 28 Minutes

  • Course Overview
  • Installing and Configuring a ZooKeeper Ensemble
  • Exploring ZooKeeper's Core Functionality
  • ZooKeeper Architecture and Data Model
  • Managing and Manipulating Data in ZooKeeper
  • Using ZooKeeper for Distributed Locks and Barriers
  • Configuring Service Discovery with ZooKeeper
  • Monitoring and Troubleshooting ZooKeeper Ensembles
  • Securing ZooKeeper with Authentication and Authorization
  • Optimizing ZooKeeper for Large-Scale Systems
  • Integrating ZooKeeper with Distributed Systems
  • Best Practices for Scaling ZooKeeper Environments
  • Implementing Distributed Locks with ZooKeeper
  • Course Summary

Site Reliability Engineering: Consul for Service Discovery and Configuration

Course: 1 Hour, 31 Minutes

  • Course Overview
  • Consul Architecture and Core Concepts
  • Deploying a Consul Cluster
  • Configuring a Consul Server with TLS and ACLs
  • Creating Consul Server Tokens
  • Configuring Consul Clients
  • Registering Services in Consul Catalog
  • Configuration with Consul KV Store
  • Service Segmentation with Consul Connect
  • Real-Time Updates with Consul Templates
  • Application Integration with Consul
  • Monitoring and Troubleshooting Consul Operations
  • Best Practices for Scaling and Securing Consul
  • Leveraging Dynamic Consul Templates
  • Course Summary

Site Reliability Engineering Toolbox: Cloud Services & Deployment Fundamentals

Course: 56 Minutes

  • Course Overview
  • Cloud Computing Basics and SRE’s Role in the Cloud
  • AWS EC2, S3, RDS, and CloudWatch Services
  • GCP Compute Engine, Storage, BigQuery, and Cloud Operations Services
  • Azure VMs, Blob Storage, and Monitor Services
  • Automated Deployments and Infrastructure as Code (IaC)
  • Multicloud Networking and Connectivity Strategies
  • Configuring Scalable and Resilient Cloud Storage
  • Managing Scalable and Resilient Cloud Storage
  • Course Summary

Site Reliability Engineering Toolbox: Advanced Multicloud Strategies and Best Practices

Course: 1 Hour, 50 Minutes

  • Course Overview
  • Cloud Security and Identity Access Management
  • Log Management and Monitoring Across Cloud Services
  • Cloud Cost Management and Optimization Techniques
  • AI/ML Services for Efficiency and Cost Optimization
  • Comprehensive Multicloud Strategy Development
  • Case Studies of Real-World Multicloud Strategies
  • Setting Up Cloud Automation for AWS and Terraform
  • Designing a High Availability (HA) Application with Failover
  • Deploying a High Availability (HA) Solution with Failover
  • Confirming the Resilience of a High Availability (HA) Solution with Failover
  • Replicating Data Between Providers via a Pipeline
  • Designing a Multi-Tier, Multicloud Solution for Deployment
  • Deploying a Multi-Tier, Multicloud Application
  • Designing a Monitoring Solution for a Multicloud App
  • Configuring a Monitoring Solution for a Multicloud App
  • Course Summary

Assessment: Final Exam: Site Reliability Engineering Tools

Track 4: Data Reliability Engineering

Learn how to build trustworthy, resilient and scalable data systems that support reliable analytics, operational maturity and data-driven decision-making.

  • Fundamentals of Data Reliability Engineering
  • Advanced Practices and Applications in Data Reliability Engineering
  • Core Tools for Data Reliability Engineering
  • Operational Excellence in Data Reliability Engineering
  • Strategic Foundations of Data Reliability Engineering
  • Engineering Scalable Data Reliability Systems

Fundamentals of Data Reliability Engineering

Course: 36 Minutes

  • Course Overview
  • Data Reliability Engineering (DRE)
  • DRE and SRE Roles and Responsibilities
  • Skills and Qualifications for DRE Careers
  • Principles of Data Reliability
  • Metrics and Tools for Ensuring Data Accuracy
  • Integrating DRE Practices into Workflows
  • Course Summary

Advanced Practices and Applications in Data Reliability Engineering

Course: 1 Hour, 4 Minutes

  • Course Overview
  • Data Reliability’s Impact on Business Decisions
  • Data Reliability Checks
  • Case Studies of Data Reliability Engineering
  • Challenges and Strategies in Data Reliability Engineering
  • Creating Data Quality Monitoring Dashboards
  • Using Distributed Locks for Data Consistency
  • Automating Data Validation in CI/CD Pipelines
  • Managing Data Schema Changes with Version Control
  • Course Summary

Core Tools for Data Reliability Engineering

Course: 50 Minutes

  • Course Overview
  • Data Observability and Monitoring Tools in Data Reliability Engineering (DRE)
  • Techniques for Assessing and Validating Data Quality
  • Incident Management Frameworks for Data Systems
  • Root Cause Analysis (RCA) for Data Reliability Issues
  • Continuous Improvement in Data Reliability
  • Comparing and Integrating Data Quality Frameworks
  • Course Summary

Operational Excellence in Data Reliability Engineering

Course: 1 Hour, 3 Minutes

  • Course Overview
  • Detect and Prevent Common Data Anomalies
  • Automate Alerts for Data Reliability Metrics
  • Create Documentation for Data Systems
  • Best Practices for Collaboration in Data Reliability
  • Building Data Observability Dashboards with Grafana
  • Conducting Data Quality Assessments Using Frameworks
  • Simulating Incidents and Root Cause Analysis Processes
  • Implementing Strategies for Continuous Data Improvement
  • Course Summary

Engineering Scalable Data Reliability Systems

Course: 56 Minutes

  • Course Overview
  • Designing Pipeline Data Reliability Checks
  • Strategies for Ongoing Data Quality Improvement
  • Reliability Check Automation for Scalability
  • Evaluating ROI and Impact of DRE Initiatives
  • Creating a Data Governance Framework
  • Implementing Security in Data Systems
  • Performing Pipeline Data Quality Checks
  • Course Summary

Strategic Foundations of Data Reliability Engineering

Course: 36 Minutes

  • Course Overview
  • Building a Comprehensive Data Reliability Framework
  • Aligning Data Governance with Reliability Goals
  • Establishing Data Security and Compliance Protocols
  • Integrating DRE into Data Engineering Workflows
  • Using Metrics and KPIs for Data Reliability
  • Course Summary

Assessment: Final Exam: Data Reliability Engineering

Who should attend?

This training is suitable for system administrators, DevOps engineers, SREs, operations professionals, platform engineers, data engineers and technical teams that want to improve reliability, scalability and operational quality.

Outcome

After completion, you can apply SRE and DRE methodologies, improve monitoring and alerting, handle incidents professionally, support deployment processes and build reliable data and cloud environments.

Why choose OEM Office Elearning Menu?

  • Self-paced online ICT training
  • Practical course content with clear learning objectives
  • Suitable for professionals, teams and organizations
  • Focused on modern IT, cloud, AI and automation skills
  • Supports career growth, certification preparation and professional development

Start the SRE/DRE Toolbox Training and build the skills to make modern systems more reliable, scalable and manageable.

Specifications

Article number
163407346
SKU
163407346
Language
English
Qualifications of the Instructor
Certified
Course Format and Length
Teaching videos with subtitles, interactive elements, assignments and tests
Lesson duration
29:21 Hours
Assessments
The assessment tests your knowledge and application skills of the topics in the learning pathway. It is available 365 days after activation.
Online virtual labs
Receive 12 months of access to virtual labs corresponding to the traditional course configuration. Active for 365 days after activation; availability varies by training.
Online mentor
You will have 24/7 access to an online mentor for all your specific technical questions on the study topic. The online mentor is available 365 days after activation, depending on the chosen Learning Kit.
Progress monitoring
Access to Material
365 days
Technical Requirements
Computer or mobile device, stable internet connection, and a web browser such as Chrome, Firefox, Safari or Edge.
Support or Assistance
Helpdesk and online knowledge base 24/7
Certification
Certificate of participation in PDF format
Price and costs
Course price at no extra cost
Cancellation policy and money-back guarantee
We assess this on a case-by-case basis
Award Winning E-learning
Tip!
Make sure you have a quiet learning environment, time and motivation, audio equipment such as headphones or speakers, and your login details for the e-learning platform.

Questions about this product?
Do you have questions about this product or need help ordering? Our AI chatbot is available 24/7, or contact us at [email protected] or call +31 36 760 1019.