I am a hardworking, passionate and imaginative data geek. I have worked with various kinds of data, such as NASA satellite imagery, transactional databases and unstructured text, to name a few. I have specialized knowledge of all stages of data engineering pipelines: integration with source systems, data modeling, ETL processes and data visualization. As a person with a scientific background, I also understand statistics and machine learning.
There is never a good time to stop learning, so I spend my free time reading books, following online courses or hacking on my side projects. When there is a chance, I like to attend workshops and meetups.
Currently I work at Facebook as a Data Engineer, where I can use all my strengths to bring the world closer together.
Facebook's mission is to give people the power to build community and bring the world closer together. Through our family of apps and services, we're building a different kind of company that connects billions of people around the world, gives them ways to share what matters most to them, and helps bring people closer together. Whether we're creating new products or helping a small business expand its reach, people at Facebook are builders at heart. Our global teams are constantly iterating, solving problems, and working together to empower people around the world to build community and connect in meaningful ways. Together, we can help people build stronger communities — we're just getting started.
As a Data Engineer at WorldRemit, my key responsibilities were:
At the beginning, my main tasks were building Python and JavaScript web scrapers and screening and formatting the gathered data. To accomplish my goals I used my problem-solving skills and creativity. Often I had to look for different workarounds to get the data I wanted. Along with Python and JS, I made extensive use of regular expressions and XPath.
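A minimal sketch of the kind of extraction step described above, combining an XPath query with a regular expression. The markup, the field names and the `extract_items` helper are hypothetical and only illustrate the technique; they are not taken from the actual scrapers. It relies on the limited XPath subset in Python's standard `xml.etree.ElementTree`, which requires well-formed markup, so real pages would likely need an HTML-tolerant parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a scraped page.
PAGE = """
<html><body>
  <div class="item"><span class="name">Widget A</span><span class="price">Price: 12.50 GBP</span></div>
  <div class="item"><span class="name">Widget B</span><span class="price">Price: 7.00 GBP</span></div>
</body></html>
"""

# Amount followed by a three-letter currency code, e.g. "12.50 GBP".
PRICE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Z]{3})")


def extract_items(markup):
    """Select item nodes with XPath, then clean the price text with a regex."""
    root = ET.fromstring(markup.strip())
    items = []
    for node in root.findall(".//div[@class='item']"):
        name = node.findtext("span[@class='name']", default="").strip()
        raw_price = node.findtext("span[@class='price']", default="")
        match = PRICE_RE.search(raw_price)
        if match is None:
            continue  # skip items whose price cannot be parsed
        items.append({
            "name": name,
            "price": float(match.group(1)),
            "currency": match.group(2),
        })
    return items


if __name__ == "__main__":
    for item in extract_items(PAGE):
        print(item)
```

The XPath expression locates the elements, while the regular expression handles the free-form text inside them; keeping the two concerns separate makes each easier to adjust when a page layout changes.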
Soon I took on new responsibilities, i.e. contributing to a Python data extraction framework. I wrote functions (and unit tests for them) to make data extraction and cleaning simpler and more automated.
During my work I also built two UIs. The first was a web application (written in Python's Flask) for managing the team's internal work. The second was a visual tool for creating advanced configuration files for the extraction framework mentioned earlier. For the latter, I used Node-RED, a flow-based programming tool built on Node.js. The idea for the second UI appeared after I won (together with a coworker) one of the company's hackathons. There we presented a prototype of the tool written in Python with its tkinter library. After that success, the project landed on the official product roadmap.
Strategic analysis of the optimal location of the company's warehouses in the United Kingdom. I was responsible for transforming the company's abstract calculation algorithm into a computer application. I used Python with the pyshp and tkinter libraries. I also created a visual presentation of the results on digital maps (for this purpose I used the open-source GIS software QGIS).
I ran classes and prepared teaching materials for future GIS specialists. Topics I covered included Python language syntax, data types, creating custom functions, conditional statements, loops, writing more complex scripts, the basics of OOP and the ArcPy library. My goal was to give students a solid foundation for programming in Python and using the ArcPy library for spatial analysis.
I was hired to develop an algorithm for automated building detection from high-resolution aerial images. It was part of a larger application designed to improve a property tax collection system. The project was innovative because it used a type of imagery not usually used for this purpose. To create an appropriate solution I had to combine knowledge from remote sensing and image processing with my own ideas. In the end, validation tests showed that my algorithm's detection rate was about 98%.
During my work I also wrote a couple of Python scripts to automate coworkers' tasks and took part in other projects, e.g. updating a database of cities and addresses or updating the National Topographic Database. My role consisted mainly of data entry, but also required writing SQL queries and data munging.
This internship was the result of the scientific research I carried out during my master's studies. I was responsible for terrestrial and mobile laser scanning field measurements and for post-processing the gathered data in specialized photogrammetric software.
The course included modules on relational databases, SQL queries, programming with VBA (digital image processing, e.g. transformations, filters and pixel-based calculations) and GIS programming.
I started my scientific projects on the use of ICESat satellite data to measure global tree heights and to improve the accuracy of the SRTM digital elevation model (which is used, among others, by Google Earth).
I received the highest possible grade for my Master's thesis, in which I investigated the possibility of improving the Polish spatial database system with ICESat data.
ICESat was a satellite mission whose primary goal was to monitor polar regions. The satellite also gathered information about the height and vertical structure of clouds and about land topography. The main objective of this publication was to assess whether data acquired by ICESat are useful for the Polish spatial data system. To accomplish this goal, the horizontal and vertical accuracy of ICESat measurements was checked. It was also analyzed whether these data can be used for estimating canopy height.
A digital elevation model (DEM), i.e. a digital representation of the surface of the Earth, is an important data source for most of the Earth sciences. Near-global DEMs like the SRTM C-Band make it possible to understand the Earth as a complex system. Despite its numerous applications, the SRTM C-Band tends to overestimate elevations over areas where vegetation is present. A novel approach utilizing ICESat ground control points was developed to remove this positive elevation bias.
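To illustrate the general idea of a bias correction based on ground control points, here is a simplified sketch. It is not the method from the publication: it assumes that the bias can be approximated as the mean difference between SRTM and ICESat ground elevations within each land cover class, and all function names, class labels and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def estimate_bias_by_class(control_points):
    """Estimate the mean SRTM elevation bias per land cover class.

    control_points: iterable of tuples
        (land_cover_class, srtm_elevation_m, icesat_ground_elevation_m)
    Returns a dict mapping each class to its mean (SRTM - ICESat) difference.
    """
    differences = defaultdict(list)
    for land_cover, srtm_m, icesat_m in control_points:
        differences[land_cover].append(srtm_m - icesat_m)
    return {cls: mean(values) for cls, values in differences.items()}


def correct_elevation(srtm_m, land_cover, bias_by_class):
    """Subtract the estimated bias; unknown classes are left unchanged."""
    return srtm_m - bias_by_class.get(land_cover, 0.0)


if __name__ == "__main__":
    # Made-up control points, for illustration only.
    points = [
        ("forest", 112.0, 100.0),
        ("forest", 108.0, 100.0),
        ("bare", 50.0, 50.0),
    ]
    bias = estimate_bias_by_class(points)
    print(bias)
    print(correct_elevation(120.0, "forest", bias))
```

A per-class mean is the simplest possible model; a more realistic correction would probably also depend on vegetation height or density.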
A 3-day course presented internationally by leading data warehousing experts, covering the latest techniques in data warehousing and BI systems. This course gave me a solid background in translating business requirements into an efficient and flexible DWH design. The focus was on planning, designing and developing a DWH solution in an incremental and agile manner.
Some of the most interesting lectures and workshops I attended covered topics like: reproducing environments with Docker, mathematical fundamentals of neural networks, data science with R, ML with Python for quant trading and telling the story behind your data.
The class gave me a strong core understanding of the JS language and its execution model. It was exercise-driven and gave me the knowledge required to make effective use of JS on the back end or front end. Some of the included topics: scope, closures, functions, data structures, combining OOP and functional programming.
I was accepted as one of the youngest participants because of my scientific achievements. The workshops included lectures and practical exercises on Earth observing systems in a broad sense and on the basics of data assimilation and machine learning.
An 11-week course covering theoretical and practical knowledge of the most advanced machine learning algorithms. Some of the covered topics were: supervised/unsupervised ML and building ML systems (debugging, bias/variance, learning curves, error analysis, ceiling analysis) and many more.
A 12-week course that provides a chance to learn about methods for using data to answer questions of cultural and economic interest. The course covers the basics of statistics: probability, random variables and their distributions, Bayes' theorem and more. Machine learning and data visualization topics are also covered. R is used throughout the course.
In this course, key concepts in data acquisition, preparation, exploration, and visualization were presented. Along with theoretical knowledge, practical examples of how to build a cloud data science solution using Azure Machine Learning, R, and Python were introduced.
The course provided working knowledge of Linux, navigation through major Linux distributions, system configuration and basic shell scripting.