Updated 24.11.2025


100 % available
Databricks Architect, Data Engineer, Lakehouse Architect, PySpark
Frankfurt am Main, Germany
Bachelor's degree in Applied Informatics and Mathematics, with a focus on Probability Theory and Statistics.

Skills
Expert in data engineering (Azure, Databricks, Spark, Hadoop, Kafka, AWS), software development (Java, Scala, Python), and DevOps (Terraform, Azure DevOps, AWS, and GitLab CI/CD).
Proven track record of successfully completed projects, with a demonstrated ability to deliver highly efficient solutions in accordance with well-architected principles and best practices (SOLID, TDD, CI/CD, Agile). Over a dozen data analytics solutions successfully architected and improved, including 10+ projects with Databricks.
I hold the following certifications and continually update my knowledge:
- Azure Solutions Architect Expert
- Azure Data Engineer Associate
- AWS Certified Solutions Architect - Associate
- AWS Certified Data Analytics - Specialty
Languages
German: business fluent
English: business fluent
Russian: native speaker
Project History
Customer 1:
Development and operation of the Enterprise Data Lake (corporate Data Lakehouse) and a Data Science Lab (a reusable, secure solution deployed for about 70 clients), automating infrastructure tasks and CI/CD pipelines.
- Azure Databricks, Azure AD, Azure DevOps, ADLS Gen2, Azure Data Factory, Azure Machine Learning, MLflow.
Improved DevOps pipelines and ML infrastructure for the commodity pricing prediction project.
- Azure DevOps, Kubernetes, Terraform, Azure Databricks, Spark, Azure SQL, Power BI, Grafana.
Developed a cost-efficient Advanced Cybersecurity Analytics solution that consumes large volumes of highly sensitive data and transforms it in real time.
- Kubernetes, Terraform, Azure DevOps, Confluent, Kafka, Logstash, Java, Rust.
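
For illustration, a minimal consume-transform-produce sketch of this pattern in Python with the Confluent Kafka client (the project itself was implemented in Java and Rust); the broker address, topic names, and the redacted field are hypothetical placeholders:

```python
# Minimal consume-transform-produce sketch with the Confluent Kafka client.
# Broker, topics, and the transform are hypothetical placeholders; the real
# project used Java and Rust.
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "broker:9092",   # placeholder
    "group.id": "security-analytics",     # placeholder
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "broker:9092"})

consumer.subscribe(["raw-security-events"])  # placeholder topic

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        # Hypothetical transform: strip a sensitive field before re-publishing.
        event.pop("user_email", None)
        producer.produce("curated-security-events", json.dumps(event).encode())
        producer.poll(0)  # serve delivery callbacks
finally:
    consumer.close()
    producer.flush()
```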
Architecture and development of 3 data integration projects (B2B platform for the cosmetics industry, production-site data analytics, operational data warehouse).
- Azure Databricks, PySpark, Kafka, dbx, Azure Data Factory, Azure SQL.
Customer 2:
Development of the corporate Data Lakehouse based on Azure Databricks, PySpark, Python, Airflow, Unity Catalog, dbx.
Development of streaming data ingestion and processing using Kafka and Spark Structured Streaming.
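
A minimal sketch of such a Kafka-to-Delta ingestion job with Spark Structured Streaming; the broker address, topic, event schema, and storage paths are hypothetical placeholders, not the project's actual configuration:

```python
# Minimal Kafka -> Delta ingestion sketch with Spark Structured Streaming.
# Broker, topic, schema, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

# Assumed event schema; a real job would define its own.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("payload", StringType()),
    StructField("event_time", TimestampType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
       .option("subscribe", "events")                     # placeholder topic
       .option("startingOffsets", "latest")
       .load())

# Kafka delivers bytes; cast to string and parse the JSON payload.
parsed = (raw
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Append to a Delta table; the checkpoint makes the stream restartable.
query = (parsed.writeStream
         .format("delta")
         .option("checkpointLocation", "/mnt/checkpoints/events")  # placeholder
         .outputMode("append")
         .start("/mnt/bronze/events"))                             # placeholder
```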
Customer 3:
Building an end-to-end data pipeline for a medical diagnosis support system, architecting and developing the complete MLOps solution on Databricks and Google Cloud.
Optimizing performance for real-time data processing.
- Azure Databricks, PySpark, GCP, MLflow, Streamlit.
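
For illustration, a minimal sketch of MLflow experiment tracking as typically used in such a Databricks MLOps setup; the experiment path, model, and synthetic data are stand-ins, not the actual diagnosis pipeline:

```python
# Minimal MLflow tracking sketch: log params, metrics, and a model artifact.
# Experiment path, model, and data are illustrative stand-ins.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

mlflow.set_experiment("/Shared/diagnosis-support")  # hypothetical path

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", acc)
    mlflow.sklearn.log_model(model, "model")  # stored with the run
```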
Consulting one of the largest insurance companies in Germany on building a Data Lake cloud concept based on Azure Databricks, with a strong DevOps focus:
- Databricks, Azure Data Factory, ADLS Gen2, Azure AD
- Documenting the concepts: DevOps, IAM, Cryptography, Business Continuity, Auditing, Logging. Building a PoC for Terraform-managed environments.
Interim Product Owner / Team Lead in an international IIoT project (GAIA-X):
- Docker, Kubernetes, Java, Spring Boot, Apache Camel
- Agile, Scrum.
Architecting a data integration framework on AWS with a strong focus on data management and DevOps best practices:
- Modularized and extensible implementation
- Support for the most complex data management use cases: Change Data Capture, Slowly Changing Dimension Type 2, and schema changes (see the SCD2 sketch after this list)
- Auditing and data lineage
- Support for CI/CD with test-driven development
- Written in Python with PySpark, Pandas, and Jinja2 templates, using Airflow to orchestrate ETL and maintenance jobs.
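
A minimal sketch of the Slowly Changing Dimension Type 2 pattern referenced above, assuming Delta Lake tables; the table names, the business key (customer_id), and the tracked attribute (address) are hypothetical:

```python
# Minimal SCD Type 2 load sketch with Delta Lake MERGE. Table names, the
# business key, and the tracked attribute are hypothetical placeholders.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession
from pyspark.sql.functions import current_timestamp, lit

spark = SparkSession.builder.getOrCreate()

updates = spark.table("staging.customers")              # placeholder source
target = DeltaTable.forName(spark, "dwh.dim_customer")  # placeholder target
current = target.toDF().where("is_current = true")

# Rows whose tracked attribute changed, plus entirely new business keys.
changed = (updates.alias("s")
           .join(current.alias("t"), "customer_id")
           .where("s.address <> t.address")
           .select("s.*"))
brand_new = updates.join(current, "customer_id", "left_anti")

# Step 1: close out the old versions of changed rows.
(target.alias("t")
 .merge(changed.alias("s"),
        "t.customer_id = s.customer_id AND t.is_current = true")
 .whenMatchedUpdate(set={"is_current": "false",
                         "valid_to": "current_timestamp()"})
 .execute())

# Step 2: append the new versions as open, current rows.
(changed.unionByName(brand_new)
 .withColumn("valid_from", current_timestamp())
 .withColumn("valid_to", lit(None).cast("timestamp"))
 .withColumn("is_current", lit(True))
 .write.format("delta").mode("append").saveAsTable("dwh.dim_customer"))
```

Splitting the load into a close-out merge and a separate append keeps each step simple to reason about and to cover with tests, which fits the test-driven approach named above.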