One thing I love about machine learning is the modeling part, where you build a model with any one of the well-known libraries and see the results for yourself. Maybe your model says "Hey, the image you loaded is a cat" or "The text you loaded is negative". Why does it feel so good to see those results?
For us humans, recognizing a cat or a dog is not a challenging task, and it's nothing we feel happy about afterwards. But training a machine, or rather a machine learning model, to tell images of cats from dogs, or to decide whether a text is negative or positive, is one hell of a job.
It's the same joyous feeling you get when you run into a foreigner you've never met who speaks your native language very well.
The more obsessed I became with building models, the less I cared about the bigger picture, which is the data. Data is crucial for building a model, yet my obsession was with the modeling itself. Caring little to nothing about my data, I would jump straight into modeling and wait for hours while it trained. Now I feel like an idiot.
And I understood one thing about models: garbage in, garbage out. A model can only be as good as what you feed into it.
I came across this thing called Exploratory Data Analysis (EDA), where people analyze the data and draw some insights before stepping into modeling. Truth be told, I never liked it and ignored it for most of my projects.
The realization began to hit when I got stuck in the same pattern of training a model for hours and hours with no increase in performance. Looking back, I understood the importance of data, and that becoming one with the data before modeling is crucial.
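To make "becoming one with the data" a little more concrete, here is a minimal sketch of what a first look could involve before any training starts: counting labels, empty fields and duplicate rows. The function name `quick_eda`, the field names and the toy records are all made up for illustration, and real EDA goes well beyond this.

```python
from collections import Counter

def quick_eda(rows, label_key="label"):
    """Summarize a list of dict records before any modeling (illustrative helper)."""
    n_rows = len(rows)

    # How balanced are the classes? A missing label shows up under None.
    label_counts = Counter(row.get(label_key) for row in rows)

    # Count fields that are None or an empty string.
    missing = Counter()
    for row in rows:
        for key, value in row.items():
            if value is None or value == "":
                missing[key] += 1

    # Exact duplicate records (assumes all values are hashable).
    unique_rows = {tuple(sorted(row.items())) for row in rows}

    return {
        "n_rows": n_rows,
        "label_counts": dict(label_counts),
        "missing_per_field": dict(missing),
        "n_duplicates": n_rows - len(unique_rows),
    }

# Hypothetical toy sentiment data with a duplicate, an empty text and a missing label.
rows = [
    {"text": "great movie", "label": "positive"},
    {"text": "terrible plot", "label": "negative"},
    {"text": "great movie", "label": "positive"},
    {"text": "", "label": "positive"},
    {"text": "loved it", "label": None},
]

report = quick_eda(rows)
for name, value in report.items():
    print(name, value)
```

Even on five toy rows, a check like this surfaces things that hours of training never would: a skewed label distribution, a blank input and a repeated example.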
This shows that modeling is only a small part of the machine learning cycle, and that most of the hassle can be avoided if we take proper care of the data.
And yeah, a lesson learned.