I am a hardworking, passionate and imaginative data geek. I have worked with a wide variety of data: NASA satellite imagery, big data platforms, transactional databases and unstructured text, to name a few. I have specialized knowledge in all stages of data engineering pipelines - integrations with source systems, data modeling, ETL processes and data visualization. As a person with a scientific background, I also have an understanding of statistics and machine learning.
There is never a good time to stop learning, so I spend my free time reading books, following online courses or hacking on my side projects. When there is a chance, I like to attend workshops and meetups.
Currently, I work as a Data Engineering Manager at Facebook, where I am building a high-quality DE team supporting FB's Ads & Business Platform.
Facebook's mission is to give people the power to build community and bring the world closer together.
I support FB's Ads & Business Platform - more specifically, the backend ads delivery system powering our personalized ads experience. My role responsibilities are:
I supported FB's Ads & Business Platform products by building the best possible data foundation, driving impact through informed decision making. My key responsibilities were:
Technologies: Presto, Spark, Hive, Python, internal equivalents of Airflow, Tableau, Jupyter
As a Data Engineer at WorldRemit, my key responsibilities were:
Technologies: Python, JavaScript, PostgreSQL, AWS, Athena, Redshift, Airflow, Flask, Ansible
Worked with the Data Operations Team to create high-quality, structured datasets using web scraping and data manipulation techniques. Responsibilities:
Technologies: Python, JavaScript, Node.js, regular expressions, XPath, Jenkins
Strategic analysis of optimal locations for the company’s warehouses in the territory of the United Kingdom.
Technologies: Python, QGIS
I worked in the Remote Sensing department on a software prototype for automating building detection from high-resolution aerial images. My responsibilities were:
Technologies: Python, QGIS, SAGA GIS
This internship was the result of the scientific research efforts I made during my master’s studies. I was responsible for terrestrial and mobile laser scanning field measurements and for post-processing the gathered data in specialized photogrammetric software.
The course included modules on relational databases, SQL queries, programming with VBA (digital image processing, e.g. transformations, filters and pixel-based calculations) and GIS programming.
I started my scientific projects on the use of ICESat satellite data to measure global tree heights and to improve the accuracy of the SRTM digital elevation model (which is used, inter alia, by Google Earth).
I received the highest possible grade for my Master's thesis, in which I investigated the possibility of improving the Polish spatial database system with ICESat data.
Digital elevation models (DEM), including the Shuttle Radar Topography Mission (SRTM), are used in many branches of geoscience as an ultimate dataset representing our planet’s surface, making it possible to investigate processes that are shaping our world. The SRTM model exhibits an elevation bias, or systematic error, over forests and vegetated areas due to the peculiar properties of microwaves, which penetrate the vegetation layer to a certain depth. Numerous investigations identified that the penetration depth depends on forest density and height. In this contribution, two methods are proposed to remove the impact of the vegetation impenetrability effect.
ICESat was a satellite mission whose primary goal was to monitor polar regions. The satellite also gathered information about the height and vertical structure of clouds and about land topography. The main objective of this publication was to assess whether data acquired by ICESat are useful for the Polish spatial data system. To accomplish this goal, the horizontal and vertical accuracy of ICESat measurements was checked. It was also analyzed whether these data can be used for estimating canopy height.
A digital elevation model (DEM), i.e. a digital representation of the Earth's surface, is an important data source for most of the Earth sciences. Near-global DEMs like the SRTM C-Band make it possible to understand the Earth as a complex system. Despite its numerous applications, the SRTM C-Band tends to overestimate elevations over areas where vegetation is present. A novel approach utilizing ICESat ground control points was developed to remove this positive elevation bias.
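The core idea of this kind of bias removal can be sketched in a few lines - a hypothetical, minimal illustration only, not the published method: treat ICESat footprints as ground control points, fit the SRTM error as a simple linear function of canopy height, and subtract the predicted bias. All values below are made-up sample data.

```python
import numpy as np

# Hypothetical sample data at five ICESat footprints (all in metres):
# SRTM elevation, ICESat ground elevation, and canopy height.
srtm_elev = np.array([212.0, 305.5, 180.2, 250.7, 199.9])
icesat_elev = np.array([205.1, 298.0, 176.5, 243.2, 196.0])
canopy_height = np.array([18.0, 20.5, 10.0, 21.0, 11.5])

# Positive bias of SRTM over vegetated areas: the radar phase centre
# sits somewhere inside the canopy, above the true ground surface.
bias = srtm_elev - icesat_elev

# Fit the bias as a linear function of canopy height (least squares).
slope, intercept = np.polyfit(canopy_height, bias, deg=1)

def correct_srtm(elevation, height):
    """Remove the predicted vegetation bias from an SRTM elevation."""
    return elevation - (slope * height + intercept)

corrected = correct_srtm(srtm_elev, canopy_height)
residual = corrected - icesat_elev
print("mean residual after correction: %.3f m" % residual.mean())
```

In practice the relationship also depends on vegetation density, and the correction would be fitted per land-cover class over many thousands of footprints, but the principle - anchor the DEM to spaceborne lidar ground returns - is the same.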
A summer bootcamp for learning more about the world of Data Science and transitioning into the data world. I ran the "Advanced SQL" session.
A 3-day course presented internationally by leading data warehousing experts, covering the latest techniques in data warehousing and BI systems. This course gave me a solid background in translating business requirements into an efficient and flexible DWH design. The focus was on planning, designing and developing a DWH solution in an incremental, agile manner.
Open Data Science Conference (ODSC) is an annual event held internationally. The purpose of ODSC events is to discuss data science and ML topics, as well as provide training sessions. Some of the training sessions covered: reproducing environments with Docker, mathematical fundamentals of neural networks, data science with R, ML with Python for quant trading and telling the story behind your data.
The class gave me a strong core understanding of the JS language and its execution model. It was driven by exercises and gave me the knowledge required to make effective use of JS on the back end or front end. Some of the included topics: scope, closures, functions, data structures, and combining OOP with functional programming.
I was accepted as one of the youngest participants because of my scientific achievements. The workshops included lectures and practical exercises on broadly understood Earth observation systems and the basics of data assimilation and machine learning.
A 7-week course covering the basics of Artificial Intelligence: search, knowledge representation, uncertainty, optimization, machine learning, neural networks and NLP.
An 11-week course containing theoretical and practical knowledge about the most widely used machine learning algorithms. Some of the covered topics: supervised/unsupervised ML and building ML systems - debugging, bias/variance, learning curves, error analysis, ceiling analysis and many more.
A 12-week course providing a chance to learn methods for using data to answer questions of cultural and economic interest. The course covers the basics of statistics: probability, random variables and their distributions, Bayes' theorem and more. Machine Learning and Data Visualization topics are also covered. R is used throughout the course.
In this course, key concepts in data acquisition, preparation, exploration, and visualization were presented. Along with the theoretical knowledge, practical examples of how to build a cloud data science solution using Azure Machine Learning, R, and Python were introduced.
The course covered working knowledge of Linux: navigating the major Linux distributions, system configuration and basic shell scripting.