Updated: 18.09.2025


Premium customer
Not available
Data Engineer / Architect
Sulzbach, Deutschland
Weltweit
Master of Science Business Informatics (Focus: Data Mining)
Skills
Scala, Python, Apache Spark, AWS (Amazon Web Services), Hadoop, GCP (Google Cloud Platform), Java, Microsoft Azure, Big Data
- Programming languages
- Python
- Scala
- Java
- JavaScript
- SQL
- (C#, R)
- Frameworks/Tools
- AWS: S3, SageMaker, Elastic Container Service
- Google Cloud Platform: BigQuery, Dataproc, Cloud Composer, Cloud Storage
- Azure: Data Lake Storage Gen2, Data Factory, DevOps, Databricks, SQL Database, Machine Learning Studio
- Apache Spark / PySpark / MLlib / Structured Streaming
- Databricks / Delta Lake
- Docker
- Kubernetes
- Terraform
- Apache Hive
- Apache Kafka
- Apache Airflow
- HDFS
- scikit-learn, pandas
- PyTorch
- JupyterLab, Jupyter Notebook, Apache Zeppelin
- SSIS
- React, React Native
- Django
- Databases
- PostgreSQL
- MongoDB
- MS SQL Server
Data Engineering & Machine Learning Expert | Cloud Specialist | Full-Stack Developer
With over 8 years of experience as a technical consultant and freelancer, I specialize in designing and deploying production-ready Machine Learning pipelines and Data Lakes across top cloud platforms, including GCP, AWS, and Azure.
- Proven Results: Successfully developed an advanced dashboard and reporting system to track carbon emissions, boosting data transparency and decision-making for stakeholders (2024).
- Versatile Expertise: Led the delivery of two high-impact, data-driven products by seamlessly integrating data engineering with machine learning, providing clients with reliable predictive analytics (2022-2023).
- Cloud Migration Success: Completed a full-scale migration to Google Cloud Platform (GCP) within a year, ensuring smooth transitions and system optimization (2021).
- End-to-End Solutions: I bring proficiency in full-stack development, enabling me to contribute to every stage of the software development process with agility and precision.
Trainings and Certificates
- AWS - Data Engineer (2024)
- Convolutional Neural Networks with PyTorch
- Udemy - 09.2021
- Full week Apache Spark Training on Databricks
- Databricks - 01.2020
- Udacity Deep Learning Nanodegree
- Udacity - 07.2019
- CRT010: Databricks Certified Developer: Apache Spark 2.X with Scala
- Databricks - 02.2019
- 3 full days Scrum training
- inovex GmbH - 03.2018
Additional experience from private projects
- Boda Wedding Web App -> see https://boda-app.de/
- React Native App -> see https://mbrissier.github.io/scood/
- I developed an ML pipeline that extracts nutritional facts from pictures of nutrition tables with an accuracy of 70 percent. Model: Convolutional Neural Network; Frameworks: PyTorch and PyTorch Mobile
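A pipeline like the nutrition-facts extractor typically ends with a text-normalization step after the CNN has recognized the raw characters. The sketch below is purely illustrative (it is not the actual project code, and the pattern and field names are assumptions): it maps OCR-style lines from a nutrition table to structured values.

```python
import re

# Hypothetical post-processing step: turn raw OCR lines from a nutrition
# table into structured (name -> (value, unit)) entries. European decimal
# commas are normalized to dots.
NUTRIENT_PATTERN = re.compile(
    r"(?P<name>[A-Za-z ]+?)\s*(?P<value>\d+(?:[.,]\d+)?)\s*(?P<unit>g|mg|kcal|kJ)",
    re.IGNORECASE,
)

def parse_nutrition_lines(lines):
    """Extract nutrient facts from OCR output lines."""
    facts = {}
    for line in lines:
        match = NUTRIENT_PATTERN.search(line)
        if match:
            name = match.group("name").strip().lower()
            value = float(match.group("value").replace(",", "."))
            facts[name] = (value, match.group("unit"))
    return facts

ocr_lines = ["Energy 250 kcal", "Fat 12,5 g", "Protein 8 g"]
print(parse_nutrition_lines(ocr_lines))
```

In a mobile deployment (PyTorch Mobile), such rule-based cleanup runs cheaply on-device after model inference.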
Languages
German: native, English: business fluent, Spanish: good
Project History
Migration to Google Cloud Platform (GCP)
- Responsible for the technical migration to GCP, coming from an on-premise Hadoop environment with multiple Apache Spark (Scala) ETL jobs.
- Finding architectural solutions for the following challenges and unknowns:
- Dynamic partition overwrite with BigQuery
- Connecting GCP to on-premise data stores such as Kafka and MongoDB
- Scheduling jobs in GCP with Cloud Composer (Apache Airflow)
- Implementing deployment and testing in GCP
- Introducing Terraform for a collaborative management of infrastructure as code
- Reengineering all on-premise Apache Spark (Scala) ETL jobs and Jenkins pipelines for the GCP migration.
- Creating Scala sbt build scripts for multiple projects and Jenkins pipelines for versioning and deployment of fat JARs and Airflow DAGs.
- Orchestrating and executing the migration phase with various dependencies between Spark jobs, departments as data consumers and newly developed features by the team.
- Introducing the new GCP setup to the developer team.
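One of the challenges listed above, dynamic partition overwrite with BigQuery, refers to Spark's `partitionOverwriteMode=dynamic` semantics, which BigQuery does not provide out of the box (common workarounds are a per-partition delete-and-insert or a MERGE statement). The following pure-Python sketch is a conceptual illustration of those semantics, not the project's actual implementation: only partitions present in the incoming batch are replaced, all others are kept.

```python
from collections import defaultdict

def dynamic_partition_overwrite(table, new_rows, partition_key):
    """Replace only the partitions present in new_rows; keep all others.

    Mimics Spark's partitionOverwriteMode=dynamic: a plain "overwrite"
    would drop every existing partition, including untouched ones.
    """
    incoming = defaultdict(list)
    for row in new_rows:
        incoming[row[partition_key]].append(row)
    # Keep untouched partitions, swap in only the rewritten ones.
    result = {p: rows for p, rows in table.items() if p not in incoming}
    result.update(incoming)
    return result

table = {
    "2021-01-01": [{"day": "2021-01-01", "clicks": 10}],
    "2021-01-02": [{"day": "2021-01-02", "clicks": 7}],
}
batch = [{"day": "2021-01-02", "clicks": 9}]
print(dynamic_partition_overwrite(table, batch, "day"))
```

The same idea translates to BigQuery as a MERGE (or a partition-scoped delete followed by an insert) keyed on the partition column.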
Technical Consultant for Big Data Projects
- At inovex, I worked as a technical consultant for various customers in the roles of Data Engineer and Machine Learning Engineer.
- I worked on a wide range of projects with different technology stacks in on-premises and cloud environments.
- I developed most of my skills in and around the following stacks and environments: the Hadoop ecosystem (especially Apache Spark), SciPy, Google Cloud Platform, AWS, Azure, and on-premises clusters.
- Please see the project history for more details. All customer company names are restricted by an NDA.
Creating a Data Science Process Framework
- Evaluating different ML frameworks for experiment tracking and model serving (e.g. MLflow, Neptune, …).
- Creating a custom Cookiecutter Data Science project template for a better project structure, git integration, python packaging, python environment management and secret management.
- Setting up a private PyPI repository for Data Science project packages on Google Cloud Platform (GCP).
- Development of a central Python API for easier integration and deployment of ML artifacts to GCP, especially for GCS and BigQuery.
- Elaborating a Data Science process framework and establishing it across different Data Science departments, in combination with the developed artifacts Python API, the PyPI repository, the Data Science project template, and the selected ML frameworks.
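The heart of such a framework is a small, stable interface that every Data Science team uses to publish and retrieve versioned artifacts. The sketch below is hypothetical (the class and method names are assumptions, and the real API targeted GCS and BigQuery); here the storage backend is a plain dict so the interface runs without GCP credentials.

```python
class ArtifactRegistry:
    """Hypothetical sketch of a central artifact-deployment API.

    In production the backend would write to GCS or BigQuery; the
    in-memory dict stands in for that so the interface is runnable.
    """

    def __init__(self, backend=None):
        self._backend = backend if backend is not None else {}

    def publish(self, project, name, version, payload):
        """Store an artifact under a stable, versioned key; refuse overwrites."""
        key = f"{project}/{name}/{version}"
        if key in self._backend:
            raise ValueError(f"artifact already published: {key}")
        self._backend[key] = payload
        return key

    def fetch(self, project, name, version):
        """Retrieve a previously published artifact by exact version."""
        return self._backend[f"{project}/{name}/{version}"]

registry = ArtifactRegistry()
registry.publish("churn-model", "preprocessor", "1.0.0", b"serialized-bytes")
print(registry.fetch("churn-model", "preprocessor", "1.0.0"))
```

Refusing overwrites of an existing version keeps deployed models reproducible: a new artifact always gets a new version, which is the property a process framework like this needs to enforce across teams.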