Updated: 18.09.2025


Premium customer
Not available
Data Engineer / Architect
Sulzbach, Deutschland
Weltweit
Master of Science Business Informatics (Focus: Data Mining)
Skills
Scala, Python, Apache Spark, AWS (Amazon Web Services), Hadoop, GCP (Google Cloud Platform), Java, Microsoft Azure, Big Data
- Programming languages
- Python
- Scala
- Java
- JavaScript
- SQL
- (C#, R)
- Frameworks/Tools
- AWS: S3, SageMaker, Elastic Container Service
- Google Cloud Platform: BigQuery, Dataproc, Cloud Composer, Cloud Storage
- Azure: Data Lake Storage Gen2, Data Factory, DevOps, Databricks, SQL Database, Machine Learning Studio
- Apache Spark / PySpark / MLlib / Structured Streaming
- Databricks / Delta Lake
- Docker
- Kubernetes
- Terraform
- Apache Hive
- Apache Kafka
- Apache Airflow
- HDFS
- scikit-learn, pandas
- PyTorch
- JupyterLab, Jupyter Notebook, Apache Zeppelin
- SSIS
- React, React Native
- Django
- Databases
- PostgreSQL
- MongoDB
- MS SQL Server
Data Engineering & Machine Learning Expert | Cloud Specialist | Full-Stack Developer
With over 8 years of experience as a technical consultant and freelancer, I specialize in designing and deploying production-ready Machine Learning pipelines and Data Lakes across top cloud platforms, including GCP, AWS, and Azure.
- Proven Results: Successfully developed an advanced dashboard and reporting system to track carbon emissions, boosting data transparency and decision-making for stakeholders (2024).
- Versatile Expertise: Led the delivery of two high-impact, data-driven products by seamlessly integrating data engineering with machine learning, providing clients with reliable predictive analytics (2022-2023).
- Cloud Migration Success: Completed a full-scale migration to Google Cloud Platform (GCP) within a year, ensuring smooth transitions and system optimization (2021).
- End-to-End Solutions: I bring proficiency in full-stack development, enabling me to contribute to every stage of the software development process with agility and precision.
Trainings and Certificates
- AWS - Data Engineer (2024)
- Convolutional Neural Networks with PyTorch
- Udemy - 09.2021
- Full week Apache Spark Training on Databricks
- Databricks - 01.2020
- Udacity Deep Learning Nanodegree
- Udacity - 07.2019
- CRT010: Databricks Certified Developer: Apache Spark 2.X with Scala
- Databricks - 02.2019
- 3 full days Scrum training
- inovex GmbH - 03.2018
Additional experience from private projects
- Boda Wedding Web App -> see https://boda-app.de/
- React Native App -> see https://mbrissier.github.io/scood/
- I developed an ML pipeline that extracts nutritional facts from pictures of nutrition tables with an accuracy of 70 percent. Model: Convolutional Neural Network; Frameworks: PyTorch and PyTorch Mobile
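A pipeline like the nutrition-facts extractor typically ends with a text-normalization step after the CNN has recognized the raw characters. The sketch below is purely illustrative (it is not the actual project code, and the pattern and field names are assumptions): it maps OCR-style lines from a nutrition table to structured values.

```python
import re

# Hypothetical post-processing step: turn raw OCR lines from a nutrition
# table into structured (name -> (value, unit)) entries. European decimal
# commas are normalized to dots.
NUTRIENT_PATTERN = re.compile(
    r"(?P<name>[A-Za-z ]+?)\s*(?P<value>\d+(?:[.,]\d+)?)\s*(?P<unit>g|mg|kcal|kJ)",
    re.IGNORECASE,
)

def parse_nutrition_lines(lines):
    """Extract nutrient facts from OCR output lines."""
    facts = {}
    for line in lines:
        match = NUTRIENT_PATTERN.search(line)
        if match:
            name = match.group("name").strip().lower()
            value = float(match.group("value").replace(",", "."))
            facts[name] = (value, match.group("unit"))
    return facts

ocr_lines = ["Energy 250 kcal", "Fat 12,5 g", "Protein 8 g"]
print(parse_nutrition_lines(ocr_lines))
```

In a mobile deployment (PyTorch Mobile), such rule-based cleanup runs cheaply on-device after model inference.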
Languages
German: native, English: business fluent, Spanish: good
Project History
Migration to Google Cloud Platform (GCP)
- Responsible for the technical migration to GCP, coming from an on-premise Hadoop environment with multiple Apache Spark (Scala) ETL jobs.
- Finding architectural solutions for the following challenges and unknowns:
- Dynamic partition overwrite with BigQuery
- Connecting GCP to on-premise data stores such as Kafka and MongoDB
- Scheduling jobs in GCP with Cloud Composer (Apache Airflow)
- Implementing deployment and testing in GCP
- Introducing Terraform for a collaborative management of infrastructure as code
- Reengineering all on-premise Apache Spark (Scala) ETL jobs and Jenkins pipelines for the GCP migration.
- Creating Scala sbt build scripts for multiple projects and Jenkins pipelines for versioning and deployment of fat JARs and Airflow DAGs.
- Orchestrating and executing the migration phase with various dependencies between Spark jobs, departments as data consumers and newly developed features by the team.
- Introducing the new GCP setup to the developer team.
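One of the challenges listed above, dynamic partition overwrite with BigQuery, refers to Spark's `partitionOverwriteMode=dynamic` semantics, which BigQuery does not provide out of the box (common workarounds are a per-partition delete-and-insert or a MERGE statement). The following pure-Python sketch is a conceptual illustration of those semantics, not the project's actual implementation: only partitions present in the incoming batch are replaced, all others are kept.

```python
from collections import defaultdict

def dynamic_partition_overwrite(table, new_rows, partition_key):
    """Replace only the partitions present in new_rows; keep all others.

    Mimics Spark's partitionOverwriteMode=dynamic: a plain "overwrite"
    would drop every existing partition, including untouched ones.
    """
    incoming = defaultdict(list)
    for row in new_rows:
        incoming[row[partition_key]].append(row)
    # Keep untouched partitions, swap in only the rewritten ones.
    result = {p: rows for p, rows in table.items() if p not in incoming}
    result.update(incoming)
    return result

table = {
    "2021-01-01": [{"day": "2021-01-01", "clicks": 10}],
    "2021-01-02": [{"day": "2021-01-02", "clicks": 7}],
}
batch = [{"day": "2021-01-02", "clicks": 9}]
print(dynamic_partition_overwrite(table, batch, "day"))
```

The same idea translates to BigQuery as a MERGE (or a partition-scoped delete followed by an insert) keyed on the partition column.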
Technical Consultant for Big Data Projects
- At inovex, I worked as a technical consultant for various customers in the roles of Data Engineer and Machine Learning Engineer.
- I worked on a wide range of projects with different technology stacks in on-premises and cloud environments.
- I developed most of my skills in and around the following stacks and environments: the Hadoop ecosystem (especially Apache Spark), SciPy, Google Cloud Platform, AWS, Azure, and on-premises clusters.
- Please see the project history for more details. All customer company names are restricted by an NDA.
Creating a Data Science Process Framework
- Evaluating different ML frameworks for experiment tracking and model serving (e.g. MLflow, Neptune, …).
- Creating a custom Cookiecutter Data Science project template for a better project structure, git integration, python packaging, python environment management and secret management.
- Setting up a private PyPI repository for Data Science project packages on Google Cloud Platform (GCP).
- Development of a central Python API for easier integration and deployment of ML artifacts to GCP, especially for GCS and BigQuery.
- Elaborating a Data Science process framework and establishing it across different Data Science departments, in combination with the developed artifacts Python API, the PyPI repository, the Data Science project template, and the selected ML frameworks.
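The heart of such a framework is a small, stable interface that every Data Science team uses to publish and retrieve versioned artifacts. The sketch below is hypothetical (the class and method names are assumptions, and the real API targeted GCS and BigQuery); here the storage backend is a plain dict so the interface runs without GCP credentials.

```python
class ArtifactRegistry:
    """Hypothetical sketch of a central artifact-deployment API.

    In production the backend would write to GCS or BigQuery; the
    in-memory dict stands in for that so the interface is runnable.
    """

    def __init__(self, backend=None):
        self._backend = backend if backend is not None else {}

    def publish(self, project, name, version, payload):
        """Store an artifact under a stable, versioned key; refuse overwrites."""
        key = f"{project}/{name}/{version}"
        if key in self._backend:
            raise ValueError(f"artifact already published: {key}")
        self._backend[key] = payload
        return key

    def fetch(self, project, name, version):
        """Retrieve a previously published artifact by exact version."""
        return self._backend[f"{project}/{name}/{version}"]

registry = ArtifactRegistry()
registry.publish("churn-model", "preprocessor", "1.0.0", b"serialized-bytes")
print(registry.fetch("churn-model", "preprocessor", "1.0.0"))
```

Refusing overwrites of an existing version keeps deployed models reproducible: a new artifact always gets a new version, which is the property a process framework like this needs to enforce across teams.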