Updated 31.10.2025


100 % available
Data Engineer, Engineering Intern
Berlin, Germany
Skills
Docker, Kubernetes, AWS, S3, PostgreSQL, SQL, Power BI, Google Analytics, clustering, PySpark, algorithms, Spark, search engines, 3D models, 3D printing, web crawling, multithreading, multiprocessing, ETL, databases, Airflow, web scraping, Python, large data sets, DNN, Keras, VBA, MS Excel, SVM, Azure
Languages
German: good | English: business fluent
Project history
At FetchCFD.com Berlin
* Implemented and deployed code using Docker and Kubernetes
* Extracted, transformed and loaded structured and unstructured data from various data sources using AWS S3 and PostgreSQL
* Built data pipelines to enable advanced business intelligence systems using SQL, Power BI and Google Analytics
* Clustered user data based on unique usage patterns using PySpark
* Ran personalization algorithms in PySpark to improve user engagement and time on platform
* Tools used: PostgreSQL, Docker, SQL, AWS, Kubernetes, Spark
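A minimal sketch of the kind of user clustering described above, assuming behavioural features such as sessions per week and average session length (the feature names and data are hypothetical, and the sketch uses plain NumPy k-means rather than the actual PySpark pipeline):

```python
# K-means clustering of users by behavioural features.
# Hypothetical illustration -- not the actual FetchCFD data or pipeline.
import numpy as np

def kmeans(X, k, iters=20):
    # farthest-point initialisation: deterministic and spreads the centres out
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        # assign each point to its nearest centre
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # recompute each centre as the mean of its members
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# two synthetic "user segments"; columns: sessions/week, avg. session minutes
X = np.array([[1.0, 2.0], [1.2, 1.8], [0.9, 2.2],
              [8.0, 30.0], [8.5, 28.0], [7.8, 31.0]])
labels, centers = kmeans(X, k=2)
```

In the PySpark setting the same idea maps onto `pyspark.ml.clustering.KMeans` over a DataFrame of feature vectors, which distributes the assignment step across the cluster.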
At Friedrich-Alexander University Erlangen
* Processed large data sets (several TB) and performed complex qualitative analysis to predict patterns inside industrial columns
* Distributed data and ran analyses across multiple computers on a cluster to predict the maximum time a building can be kept heated from ground energy
* Developed a predictive algorithm to estimate the optimal temperature gradient and its corresponding response time for large data sets from an automobile manufacturing process
* Tools used: DNN, Keras, Python, parallel computing, clusters, genetic algorithms
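A toy sketch of the genetic-algorithm approach listed in the tools above, applied to a stand-in objective (minimising (x - 3)^2, so the optimum is x = 3); the real objective, encoding and parameters were specific to the temperature-gradient problem and are not reproduced here:

```python
# Simple real-valued genetic algorithm: selection, crossover, mutation.
# The objective and all parameters below are illustrative placeholders.
import random

def genetic_minimise(fitness, lo, hi, pop_size=30, generations=60, seed=42):
    rng = random.Random(seed)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)               # best individuals first
        survivors = pop[: pop_size // 2]    # selection: keep the top half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = (a + b) / 2.0           # crossover: midpoint of parents
            child += rng.gauss(0, 0.1)      # mutation: small Gaussian noise
            children.append(min(max(child, lo), hi))
        pop = survivors + children
    return min(pop, key=fitness)

best = genetic_minimise(lambda x: (x - 3.0) ** 2, lo=-10.0, hi=10.0)
```

Keeping the sorted top half each generation gives elitism for free, so the best solution found so far is never lost.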
At BASF SE Ludwigshafen
* Modeled flow through a sharp-edged orifice using VBA in MS Excel for spray applications
* Developed functions and expressions in MS Excel capable of calculating the pressure loss through orifice plates with inclined inlet flow or inlet swirl
* Tools used: MS Excel, VBA
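The core relation behind such a calculation is the standard sharp-edged orifice equation, Q = Cd · A · sqrt(2·ΔP/ρ), rearranged for the pressure loss. A sketch in Python (the original work used VBA functions in Excel; the discharge coefficient 0.61 is a typical textbook value for sharp-edged orifices, and the example numbers are assumptions):

```python
# Pressure loss across a sharp-edged orifice from the standard
# discharge equation. Cd and the example inputs are illustrative.
import math

def orifice_pressure_loss(q, d, rho, cd=0.61):
    """Pressure loss (Pa) across a sharp-edged orifice.

    q   -- volumetric flow rate, m^3/s
    d   -- orifice diameter, m
    rho -- fluid density, kg/m^3
    cd  -- discharge coefficient (dimensionless)
    """
    area = math.pi * d ** 2 / 4.0
    return rho / 2.0 * (q / (cd * area)) ** 2

# e.g. 0.1 L/s of water through a 5 mm orifice
dp = orifice_pressure_loss(q=1e-4, d=0.005, rho=1000.0)
```

Inclined inlet flow or inlet swirl would be handled by correcting Cd, which is what the additional Excel expressions accounted for.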
Libraries & Tools:
Python | Spark | Scikit-Learn | JupyterLab | SQL | Docker
Algorithms:
DNN | Random Forests | SVM | Time Series Analysis | Clustering | XGBoost