Updated 08.09.2025

**** ******** ****
100% available

Data Scientist

Munich, Germany · Doctorate in applied mathematics

Profile attachments

Software development
Lebenslauf.pdf

Skills


Mathematics
  • Statistics: linear regression, logistic regression, elastic net, time series analysis
  • Machine learning: decision trees, random forests, support vector machines, clustering, nearest-neighbour classification, bagging, cross-validation
  • Data mining: Apriori, frequent itemset mining, frequent sequence mining
IT skills

R [10 years] Package development
Shiny [3 years] Taught best practices
Git [3 years] Systematic use of Bitbucket
SQL [2 years] Basic ETL

Languages

  • German: business fluent
  • English: business fluent
  • French: native speaker
  • Spanish: basic knowledge

Project history

Development of an R-based ETL solution to replace Microsoft SSIS

Insurance

>10,000 employees

At the heart of the reinsurance business lies the transfer of risk and premium fees.

  1. Policyholders, such as individuals insuring their cars and businesses insuring their stock, transfer their risks to insurers against premiums.

  2. Insurers themselves transfer their risks to reinsurers against a premium.

  3. Reinsurers transfer their risks to financial markets for a premium.

The transfers manifest as a constant flow of data to and from reinsurers.

 

The client, the international life reinsurance business unit of an international reinsurer from Bavaria, processes data daily from various sources such as primary insurers, market quote providers and trade simulators. Microsoft SQL Server Integration Services (SSIS) used to be the linchpin of the “Extract Transform Load” (ETL) part of the process. Unfortunately, SSIS-based ETL proved hard to develop, maintain and update, so few team members could use it and it became a bottleneck. Furthermore, Microsoft can stop supporting a given version of SSIS at any time and force the business unit to migrate to a version that is not backward compatible. I was hired to develop an R-based replacement for SSIS to tackle those problems.

 

The R-based solution I developed replaced all the required SSIS functionality and met the desired non-functional requirements, and more:

  • Any team member can quickly and independently implement an ETL process

  • ETL processes are no longer a bottleneck

  • The business unit owns the system and evolves it as they see fit

  • New processes can be automated, freeing up time for the team members

SSIS is now decommissioned and the R-based solution I developed is the new linchpin: it saves time, money and headaches.

  Technicalities

The data import framework I developed consists of 5 R packages that are

  • structured following Hadley Wickham's recommendations

  • documented individually via roxygen2

  • documented globally via bookdown

  • unit tested via testthat

  • version controlled via Git, hosted on Azure DevOps

  • released in a private R package repository.
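
For illustration only, the extract, transform and load steps that such a framework automates can be sketched as below. The sketch is written in Python rather than R and uses only the standard library. Every name and value in it (the CSV columns, the premiums table, the function names) is hypothetical and not taken from the client's system.

```python
import csv
import io
import sqlite3

# Hypothetical input: a small CSV extract such as a primary insurer might deliver.
RAW_CSV = """policy_id,premium,currency
P-001,1200.50,EUR
P-002,,EUR
P-003,980.00,eur
"""


def extract(text):
    """Read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows without a premium; normalise types and currency codes."""
    clean = []
    for row in rows:
        if not row["premium"]:
            continue  # incomplete record, skipped in this sketch
        clean.append(
            (row["policy_id"], float(row["premium"]), row["currency"].upper())
        )
    return clean


def load(rows, connection):
    """Write the cleaned rows into a (hypothetical) premiums table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS premiums "
        "(policy_id TEXT PRIMARY KEY, premium REAL, currency TEXT)"
    )
    connection.executemany("INSERT INTO premiums VALUES (?, ?, ?)", rows)
    connection.commit()


def run_pipeline(text, connection):
    """Run extract, transform and load in sequence; return the rows loaded."""
    rows = transform(extract(text))
    load(rows, connection)
    return len(rows)
```

Keeping each step a small, separately testable function mirrors the unit-tested package structure listed above; in R the same split would map onto exported functions covered by testthat.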


Replication of a chemical compound

Industry and mechanical engineering

>10,000 employees

The properties of a chemical compound depend on its molecular composition. When the compound is expensive, rare or difficult to procure, one may want to replicate its composition by mixing available precursors. The difficulty is creating a mix that appropriately balances the properties and the price of the replicate.

 

The chemical company relies on subject matter experts for this task. Because their expertise takes long to acquire, is difficult to share and is time-consuming to apply, the company was looking for a way to automate as much of the process as possible. The experts would then focus on the most valuable tasks. My task was to enable the experts to start from a reasonable, data-driven replicate.

 

I proceeded as follows.

  1. I discussed the chemistry with experts.

  2. I modelled the problem and embedded the solution into a graphical interface.

  3. I integrated the compound editing tool that the experts use into the graphical interface.

 

The tool saves the chemists a lot of time by providing them with a sound starting point, which they then refine via the graphical interface and practical experiments.

  Technical tools  
  • Mathematics: Linear programming optimisation

  • Data analysis: R

  • Visualisation: R-Shiny

  • Data source: Excel files
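
As a rough illustration of how such a blending problem can be posed as a linear programme, consider the formulation below. It is a sketch under stated assumptions, not the model used in the project: properties are assumed to blend linearly, and all symbols are introduced here for illustration. Let x_j be the fraction of precursor j in the mix, c_j its unit price, a_ij the value of property i for precursor j, t_i the target value of property i in the original compound and ε_i the accepted tolerance.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{j=1}^{n} c_j x_j \\
\text{s.t.} \quad & \sum_{j=1}^{n} x_j = 1, \\
& t_i - \varepsilon_i \le \sum_{j=1}^{n} a_{ij} x_j \le t_i + \varepsilon_i, \quad i = 1, \dots, m, \\
& x_j \ge 0, \quad j = 1, \dots, n.
\end{aligned}
```

Tightening the tolerances ε_i favours fidelity to the original compound, while loosening them favours a cheaper mix, which is the balance between properties and price described above.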


Prediction of debt collection over 180 months

Banking and financial services

In the debt collection business, predicting future collections is of crucial importance when pricing debt portfolios, budgeting resources and acquiring clients. The debt collection company hired me to replace their time-consuming manual process with a data-driven prediction tool that any member of the controlling department can use.

 

I created the prediction tool in collaboration with the controllers as follows.

  1. The controllers taught me the basics of their business.

  2. The head data engineer introduced me to their information system and database.

  3. I developed a flexible model of how debts are collected and developed ways to test it.

  4. I developed a graphical user interface so that the controllers can run the prediction process themselves until it is fully integrated into the monthly IT processes.

 

Predicting debt collection used to take the whole team 6 weeks. It now takes 3 hours plus some manual adjustment.

  Technical tools
  • Mathematics: Modelling and quantile regression

  • Data analysis: R

  • Visualisation: Shiny

  • ETL: Oracle
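
To give a feel for the quantile regression idea listed above, here is a minimal sketch of the pinball (quantile) loss, whose minimiser is a quantile rather than a mean. It is written in Python rather than R, restricts the forecast to a single constant for brevity, and uses invented numbers. It is not the model built for the client.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss at quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def best_constant_quantile(y, tau):
    """Return the observed value that minimises the pinball loss
    when used as a constant forecast."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda q: pinball_loss(y, [q] * len(y), tau))


# Hypothetical monthly collections (arbitrary units) with one large outlier.
collections = [10.0, 12.0, 11.0, 13.0, 50.0]
median_forecast = best_constant_quantile(collections, 0.5)
upper_forecast = best_constant_quantile(collections, 0.9)
```

With tau = 0.5 the loss is half the mean absolute error, so the minimiser is the median and is not dragged upward by the outlier. A higher tau such as 0.9 yields a deliberately cautious upper forecast. A full quantile regression replaces the constant with a function of explanatory variables.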

