Slawomir Tulski

Full-Stack Data Engineer

Call Center Efficiency Monitoring System

This post presents a monitoring system for a fictitious call center. The environment supports tracking of key operational metrics, real-time monitoring of employee performance, and offline analysis to drive further process improvements.

Full-stack Data Engineering pipeline

This material presents a complete project that draws on many skills crucial for data engineers. It starts with the creation of a made-up operational system and a dimensional model for the DWH. Next, a full-blown ETL process is implemented to populate the DWH. Finally, the main KPIs are chosen and visualized on a dashboard.

Testing Hadoop/Hive and Redshift as ETL (ELT) tools

Data transformation is the most complex part of the process called ETL, which stands for Extract, Transform and Load. In general, the goal of an ETL job is to take data in a certain form from system A and move it in a desired form into system B. In this post I describe tests conducted to assess the Hadoop/Hive and Redshift stacks as ETL toolkits.

Comparing music listening: 2008 vs 2016

Having had an account since 2005, I now have an archive of more than 162 thousand recorded tracks. I decided to take a slightly closer look at what I have listened to. For comparison, I took two time periods: the years 2008 and 2016...

Price movement prediction using a decision tree

Traditionally, there have been three major approaches to trading: fundamental analysis, technical analysis, and passive investing. Recently, a new approach has become increasingly popular: algorithmic trading. In this post I create and evaluate a decision tree model to predict the direction of price movement over the next 14 days.

Web scraping is more than just parsing HTML...

The modern web is an endless source of all kinds of data. Viewing that data only through a web browser is limiting. This is where web scrapers come into play. The ability to programmatically access and extract internet resources opens a broad new range of possibilities for developers.

Creepy Scraper

Nowadays, it’s not a problem to find personal information about someone. And I’m not talking about famous people. Everyone dumps tons of data about themselves into various social media. It’s part of our lifestyle now, and I am no visionary here. Is it good or bad? That’s not the topic of this post. But it lies close to it...

Proxy harvester

It is always nice to have proxies... unfortunately, it is less nice when you have to pay for them. Especially when you only want reasonably reliable proxies from time to time, just for your personal scraping. You may not want to bother with VPNs or anything similar. What can you do?