Updated 17.10.2025


100% available
Senior Data Engineer, Senior Data Consultant, Senior Data Architect
Kelkheim, Germany
Available worldwide
Master's in Computer Science (Data Science)
About me
Experienced Data Engineer & Consultant focused on SQL, Python, ADF, data modeling, and Power BI. Specialized in data lake architectures, ETL processes, and cloud platforms (Azure, AWS, Snowflake, Databricks).
Skills
SQL, T-SQL, PL/SQL, stored procedures, query optimization, query performance, performance tuning, indexing, Microsoft SQL Server, Azure SQL, SQL Data Warehouse, SSIS, SSRS, DTS, MySQL, PostgreSQL, Oracle, Oracle Financials, Sybase ASE, relational databases, logical data modeling, database development, database performance, data architecture, data warehousing, data marts, data lakes, data pipelines, ETL, data analysis, data processing, data transformation, real-time data, streaming, Big Data, Apache Hadoop, HDFS, MapReduce, YARN, Apache Hive, Apache HBase, Apache Spark, PySpark, Apache Flink, Apache Kafka, Apache Airflow, Presto, Zookeeper, Cassandra, Neo4j, NoSQL, graph databases, AWS, Amazon S3, Amazon Redshift, Amazon EMR, DynamoDB, AWS Glue, AWS Lambda, AWS CloudFormation, CloudWatch, Microsoft Azure, Azure Data Factory, Azure Data Lake, Azure Logic Apps, Snowflake, Databricks, dbt (Data Build Tool), Power BI, Parquet, JSON, XML, Python, Java, Scala, Perl, VBA, Unix, Unix shell scripting, event-driven programming, APIs, test-driven development, agile methodology, Git, GitHub, Subversion (SVN), version control, Docker, Kubernetes, Jenkins, cloud computing, profiling, Perfmon, SQL Server Profiler, SAP applications, MS Access, McAfee VirusScan
Languages
German: good; English: business fluent
Project history
Description:
* Designed, architected, and developed a Medallion architecture in Snowflake to process SAP data for the Customer Care team.
* Designed and developed an ETL pipeline that processes SAP data, transforming it and storing it in an AWS Athena data lake.
* Created a BI model feeding Power BI from AWS Athena and Snowflake.
* Migrated the Snowflake transformations to dbt to build the Silver and Gold layers.
* Created Airflow DAGs to run dbt (a sketch follows this list).
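As a rough illustration of the Airflow-plus-dbt setup above, a minimal DAG of this shape could chain the Silver and Gold builds. The DAG id, schedule, project directory, and tag selectors are assumptions for the example, not taken from the actual project:

    # Minimal sketch: an Airflow DAG that runs dbt to build the Silver and Gold layers.
    # All names and paths below are illustrative assumptions.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="dbt_medallion_build",    # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule="@daily",               # assumed schedule
        catchup=False,
    ) as dag:
        # Build the Silver-layer models first ...
        dbt_silver = BashOperator(
            task_id="dbt_run_silver",
            bash_command="dbt run --project-dir /opt/dbt --select tag:silver",
        )
        # ... then the Gold-layer models that depend on them.
        dbt_gold = BashOperator(
            task_id="dbt_run_gold",
            bash_command="dbt run --project-dir /opt/dbt --select tag:gold",
        )
        dbt_silver >> dbt_gold

Chaining two dbt invocations keeps the layer boundary visible in the Airflow UI; a single "dbt run" with dependency-based selection would work as well.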
Description:
* Rearchitected the system for efficiency using Databricks and Athena.
* Designed and developed an ETL pipeline to extract and transform data on S3 using Databricks, dbt, Athena, and Airflow.
* Wrote efficient SQL queries against Amazon Redshift, Databricks SQL, and Athena to extract and transform data.
* Used Databricks DLT pipelines for incremental loads.
* Moved transformation logic from Redshift into Databricks (Apache Spark/PySpark) and AWS Athena, improving performance and cutting Redshift cost.
* Wrote PySpark scripts for the most complex, compute-heavy transformations (a PySpark sketch follows this list).
* Wrote AWS Athena queries for the simpler, smaller transformations.
* Used Apache Airflow to orchestrate the end-to-end pipeline: extract to S3, transform in Databricks and Athena, load into Redshift.
* Created materialized views (MVs) for the most important reports.
* Developed AWS Lambda functions in Python (a Lambda sketch follows this list).
* Developed AWS Athena queries for reporting.
* Used dbt on Databricks to build the DWH Silver and Gold layers.
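To illustrate the compute-heavy PySpark transformations mentioned above, a sketch of this shape would read raw Parquet from S3, aggregate it, and write a partitioned curated table back. Bucket names, columns, and paths are invented for the example:

    # Sketch of a compute-heavy PySpark transformation on Databricks.
    # Bucket names, columns, and paths are illustrative assumptions.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_daily_agg").getOrCreate()

    # Read the raw extract from S3 (hypothetical location).
    orders = spark.read.parquet("s3://example-raw/orders/")

    # Aggregate orders per day and customer.
    daily = (
        orders
        .withColumn("order_date", F.to_date("order_ts"))
        .groupBy("order_date", "customer_id")
        .agg(
            F.count("*").alias("order_count"),
            F.sum("amount").alias("total_amount"),
        )
    )

    # Partition by date so downstream Athena/Redshift scans stay cheap.
    daily.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3://example-curated/orders_daily/"
    )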
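The Python Lambda work can likewise be sketched as a small handler that starts an Athena reporting query via boto3. The database, table, workgroup, and output location are assumptions for the example:

    # Sketch of an AWS Lambda handler that kicks off an Athena reporting query.
    # Database, table, workgroup, and output location are illustrative assumptions.
    import boto3

    athena = boto3.client("athena")

    def lambda_handler(event, context):
        # Start the query asynchronously; a downstream step would poll for results.
        response = athena.start_query_execution(
            QueryString=(
                "SELECT order_date, SUM(total_amount) AS revenue "
                "FROM reporting.orders_daily "
                "GROUP BY order_date"
            ),
            QueryExecutionContext={"Database": "reporting"},  # hypothetical database
            WorkGroup="primary",
            ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
        )
        return {"query_execution_id": response["QueryExecutionId"]}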