Updated 08.07.2025

SK
100% available

Database / Site Reliability Engineer

Munich, Germany, Bachelor's degree

Skills

Everything is a trade-off in system operations.

Over the past 18 years I have been using the following technologies / software:
chef, puppet, ansible, cfengine, terraform, packer for configuration management; mcollective for orchestration; devops attitude.
jenkins, gitlab-ci, bamboo for build automation.
pfsense, brocade zxtm, amazon web services, elastic load balancers,
xymon, nagios, icinga2, cacti, smokeping, pingdom, TIG (telegraf, influxdb, grafana), TICK (telegraf, influxdb, chronograf, kapacitor).
elasticsearch, kafka, zookeeper, mssql, mysql, postgresql, cratedb, aurora, redis.
aws ec2, openstack, virtualbox, kvm, ovirt, vagrant, terraform.
debian / ubuntu, but Redhat / CentOS are the preferred OS.
ipsec / racoon / strongswan / pfsense

Languages

German: basic knowledge
English: native speaker

Project history

Lead Database Reliability Engineer

Private

Internet and information technology

5,000-10,000 employees

Database operations for a leading cybersecurity company using ansible, terraform, packer and jenkins on AWS.
Documenting deployments, backups, restores, ticketing on Jira.
Development of elasticsearch maintenance pipelines, ec2 instance retirement notices and replacements, elasticsearch upgrades, backups, restores.
Development of elasticsearch monitoring and alerting using telegraf, influxdb, grafana.
Maintaining 100 self-managed elasticsearch clusters (2,400 instances in total) on AWS, not the managed Amazon Elasticsearch Service.
Assisting teams in migrating and updating ES versions and in applying ES best practices, feeding improvements back to the teams.
Cost saving initiatives - scaling clusters up and down relative to their actual usage.
Maintaining/automating kafka / zookeeper clusters.
Development of kafka maintenance pipelines, ec2 instance retirement notices and pipelined node replacements.
Maintaining redis stacks, node failover, replacement, maintenance with Jenkins.
Maintaining/automating ScyllaDB clusters.
Maintaining the jenkins environment on kubernetes.
Maintaining / deploying the frontends for Kibana, OpenSearch Dashboards, akhq and cassandra reaper on kubernetes.
Implemented an "s3 bootloader" for flashing OS boot drives on AWS: very fast OS replacement with data disks left intact.
Maintaining various AWS aurora/postgres stacks with terraform and jenkins.

Lead Site Reliability Engineer

Private

Internet and information technology

1,000-5,000 employees

January 2018 - July 2019
Management / maintenance of 250 miscellaneous Linux servers (bare metal and VMs).
Managing software deployments, supporting new developer requests, creating development/automation pipelines.
Migrating from “cdist” (bash/scp) automation to ansible.
Building automation and deployment pipelines with gitlab, terraform, packer, ansible on AWS and OCI.
Planning and groundwork for upgrades from CentOS 5/6 and Debian/Ubuntu* to CentOS 7.
Initial deployment of development and production environments in AWS VPCs using terraform modules.
Creation of an ipsec mesh network using ansible to migrate old datacenters and servers to new AWS VPCs.

Lead DevOps Engineer

Travian Games GmbH

Internet and information technology

50-250 employees

Management / maintenance of 1,700 CentOS 6 VMs.
CentOS, git, nginx, php, MySQL, nagios, icinga2, grafana, F5, bacula, vmware, aws, bamboo, puppet, vagrant stack.
Planning and groundwork for initial upgrades of core systems to CentOS 7.
Migrating several custom in-house solutions (bash, python, php, service-now) to puppet 4 and hiera.
Deployment of a new datacenter for a new game project.
Icinga2 deployment with puppetdb driven host/service detection/collection.
Improving vagrant environment to support and test new tools.
Interviewed 20 applicants and hired 3 staff members to bring the permanent team back to 9 members.
