The most important lesson I learned as a Data Scientist that translated into my whole life.

5 min read · May 30, 2021

One of the very first things we learn as Data Scientists is the concept of a metric. A metric is nothing more than a measurement that we use to track success or failure.

Let me give you an example:

Imagine I own a retail clothing shop and have picked revenue as the metric that tells me whether my store is succeeding. Is this the best metric I can use? The more revenue, the better my store is doing, right? Not necessarily. Revenue ignores something very important when discussing success in retail: expenses. So it may be better to think of the store’s success in terms of profit (revenue minus expenses) instead of revenue, as it gives us a better idea of how the business is really doing.

This demonstrates how easy it is to think we are doing well while in reality the numbers paint a different picture!

This same principle can be, and is, applied in the field of Machine Learning and Artificial Intelligence (AI). After working in the field for almost 2 years, I’ve come to the conclusion that these models hold little to no intelligence. Most of the time they are actually pretty dumb and unreliable. They are like single-minded individuals who can do one very specific thing really well (sometimes).

So let’s say that I have an AI system that predicts whether there’s a cat in a picture. This is a common problem in Data Science called binary classification, where we have two labels: Cat (1) and Not a Cat (0). One of the most common and straightforward metrics for evaluating how our model is doing is accuracy: the fraction of pictures the model labels correctly.

Pretty simple, right? At first glance it looks like a perfect fit for our problem. Let’s assume our model has an accuracy of 99%. With this information alone, can you be certain this system is reliable enough to be used in a production environment?

Unfortunately, the answer is no! Why? Let’s take a look at the data…

If you quickly glance at the graph you can see a problem: there are far more examples of Cat pictures than of Not a Cat pictures.

What is probably happening is that the system was trained on a dataset composed mainly of cat images. This is a classic issue in classification problems called data imbalance. Since our model mostly sees cat images, it can learn to predict only cats and still achieve a high accuracy. For example, if we have a validation set of 5000 examples sampled from the same distribution, we will have approximately 4950 cat images and 50 non-cat images.

If the model predicts that all images are cats, it will have an accuracy of 4950/5000 = 99%, which looks absolutely amazing but tells us nothing about how reliably our model can tell cats from non-cats.
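To make the arithmetic concrete, here is a minimal Python sketch. The label lists and the two helper functions are made up for illustration; they simply assume the 4950/50 split described above and a "model" that always answers cat.

```python
# Hypothetical validation set: 4950 cat images (1) and 50 non-cat images (0).
y_true = [1] * 4950 + [0] * 50

# A "model" that ignores its input and always predicts cat.
y_pred = [1] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def fraction_correct_for(label, y_true, y_pred):
    """Of the examples whose true label is `label`, the fraction predicted correctly."""
    preds = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in preds) / len(preds)


print(accuracy(y_true, y_pred))                 # 0.99
print(fraction_correct_for(0, y_true, y_pred))  # 0.0
```

The overall accuracy is 99%, yet the model gets none of the 50 non-cat images right, which is exactly the weakness that the headline number hides.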

There are other metrics that can give us a better and more accurate view of our model’s performance, like Cohen’s Kappa score or F1 score, and a quick look at the confusion matrix helps too. Explaining these is out of the scope of this article, but you can check this guide to find out more about them: https://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/
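Without going into the theory, here is a rough Python sketch of how these metrics react to the always-cat model. Everything in it is an assumption for illustration: the data is the same hypothetical 4950/50 split, and the helper functions are hand-written from the standard textbook definitions rather than taken from any library. Note that F1 depends on which class you treat as "positive".

```python
# Same hypothetical setup: 4950 cats (1), 50 non-cats (0), model always says cat.
y_true = [1] * 4950 + [0] * 50
y_pred = [1] * len(y_true)


def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 here when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(tp, fp, fn, tn):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) * n                                   # scaled by n*n
    expected = (tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)   # scaled by n*n
    if n * n == expected:
        return 0.0  # kappa is undefined here; this sketch just returns 0.0
    return (observed - expected) / (n * n - expected)


cat = confusion_counts(y_true, y_pred, positive=1)
not_cat = confusion_counts(y_true, y_pred, positive=0)

print(cat)                            # (4950, 50, 0, 0)
print(round(f1_score(*cat[:3]), 3))   # 0.995
print(f1_score(*not_cat[:3]))         # 0.0
print(cohen_kappa(*cat))              # 0.0
```

F1 for the cat class still looks great, but F1 for the Not a Cat class and Cohen’s Kappa both drop to zero, signalling that the model is doing no better than always guessing the majority class.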

My point goes a bit deeper and intersects with our everyday life. Sometimes we end up choosing the wrong metrics to evaluate our performance in various areas of our life. Another example I want to use is weight loss, something that, like many, I struggled with for many years before finally understanding what I was doing wrong.

Growing up a bit on the chubby side, I had never seen my abs, and because of that I had set myself the goal of seeing them. But even when I lost a lot of weight they were nowhere to be found! At the time, I was doing a lot of Muay Thai and even competing in some amateur competitions. I weighed 70 kg and had a flat stomach, yet I still couldn’t see my abs.

Do you already see what I was doing wrong? For years I was looking at the wrong metric. I was obsessed with weight, checking it every day, and I thought that the lower the number on the scale, the closer I was to my goal. Unfortunately, just like in the previous machine learning example, I was optimising a metric that did not reflect my goal!

Today I use other metrics such as body fat percentage, fat weight and my favorite of all: simply looking in the mirror and being honest with myself. After 2 months of paying attention to these metrics I achieved what I hadn’t been able to do for 4 years. I finally saw my abs.

So, my final question to you, dear reader, is: which metrics are you choosing to evaluate success in the various aspects of your life? Are you sure they are the right ones? Have you looked for alternatives?

What metric do you pick to evaluate success in life? Work, money, work-life balance, travelling?

My advice is that you carefully pick one (or several)! No metric is absolutely better than another; some just fit your goals better than others! Most of us have different goals, so it’s normal to also have different metrics to evaluate our lives!

Written by Zé Conceição