Photo by the author.

A data pipeline is stuck and needs someone's attention. Or a query needs to be refactored to use a new table because one of its source tables is being deprecated or becoming unavailable. Or we need to extend the SQL logic of a very old data pipeline.

In all those…


Photo by Danil Sorokin on Unsplash. Modified by the author.

As a data engineer at a FAANG company, I frequently hear people ask which skills matter most for landing a data engineering job at a well-known tech company. Many people think they need to be fluent in Spark or know everything about Hadoop systems to get a…


Photo by Waldemar Brandt on Unsplash

For data scientists, machine learning models and visualization packages are essential tools. More importantly, though, data scientists rely on data and data infrastructure to do their analytics and modeling. Without data and databases, every analytics tool and technique they develop is useless. Many data scientists get their data in raw formats…


Image by the author.

Logging is a popular solution for tracking events in code and for debugging. Many of us (Python programmers and data scientists) have the bad habit of using print() to debug and track events in our code.

Why is using print() for logging and debugging not good practice?

  1. The print()…
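To make the contrast with print() concrete, here is a minimal sketch using Python's standard logging module. It illustrates one commonly cited difference (severity levels and a consistent message format), which is not necessarily the point the truncated list above goes on to make. The helper name `build_logger`, the logger name "demo", and the sample messages are hypothetical and chosen only for illustration.

```python
import io
import logging


def build_logger(name, stream, level=logging.INFO):
    """Return a logger that writes formatted records to `stream`.

    `build_logger` is a hypothetical helper, not part of the logging module.
    """
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.addHandler(handler)
    return logger


# An in-memory buffer stands in for a file or the console.
buffer = io.StringIO()
log = build_logger("demo", buffer)

log.debug("not shown at INFO level")        # filtered out by the level
log.info("loaded %d rows", 3)               # recorded with its severity
log.warning("missing column: %s", "price")  # recorded with its severity

output = buffer.getvalue()
print(output, end="")
```

Unlike print(), each message here carries a severity, so raising or lowering the level silences or enables whole classes of messages without editing the call sites.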


Photo on Unsplash

Building Python applications that have graphical user interfaces and perform sophisticated tasks might look difficult. In a recently published article (see the link below), I mentioned that only 7 Python libraries are needed to start building applications.

This article will show you how to build a simple translation application…


Photo by Ross Findon on Unsplash

I have taught various data science topics to different groups of scientists (mostly non-data scientists). I asked a simple question at the beginning of the class to break the ice. Interestingly, most non-data scientists, unlike data scientists, found it a hard question. I am going to share the question…

