I’m not sure how much I’ve ‘learnt’ in the past week, and I think this is because it really depends on what learning something means. Often I find myself preferring to ‘learn’ a new thing rather than apply something I have supposedly learnt.
This is largely because something like a Kaggle competition can seem like quite a daunting prospect compared to the satisfying feeling of watching videos, which somehow feels like accomplishing something. I have recently started using Codewars, which I’m thoroughly enjoying, and this is where I really feel I am learning the things I’ve already supposedly learnt through books such as ‘Automate the Boring Stuff’. Codewars is a dojo-themed code challenge website, where you are given problems of increasing difficulty as you climb the ranks. You’re given real problems and are encouraged to refactor your code and follow best practices.
Being on a website such as Codewars or Kaggle does one of two things for me: it either highlights how much I’ve actually learnt from the various online courses or books I’ve started, or it highlights how much I have not learnt. That feeling of understanding I had when I went through the lesson on dataframe index manipulation goes out the window when I realise I have no idea how to actually apply it. Therefore it’s my intention to stop shying away from this and really start focusing on the application.
Practical application websites
- Codewars - as mentioned above, short code challenges of increasing difficulty - more programming focused
- Analytics Vidhya - not actually used this yet, just seen it mentioned
- Kaggle - a data science focused project/competition site where you build kernels to describe your work
- DrivenData - like Kaggle but with a focus on projects that can have a positive social impact
- HackerRank - for all programming languages - practice, compete or look for jobs - their 30 Days of Code challenge is worth doing
So are video courses a waste of time in your journey to real knowledge?
All of this is not to say that I feel I’ve wasted my time on websites like DataCamp or Udemy. I believe that going through these websites at some point, starting courses and drifting away from them, is almost a rite of passage. Whilst I have definitely seen far greater and more worthwhile improvements since I started applying what I know or learning through a problem, I wouldn’t steer others away from using these sites.
However, I know that for myself, I was far happier with my progress with Python once I started handling our A/B test reporting through BigQuery from Google Analytics into Jupyter Notebooks whilst at Eurostar. It’s in these applications that you see the beauty of loops, classes, dictionaries, dataframes, local vs. global variables and the rest.
My habit tends to be spending 10 hours studying for each hour spent applying the knowledge I have. I spend more time learning as much as possible from books, videos, courses and articles than applying what I’ve already learnt. My aim is to try to equalise this ratio to one hour learning for each hour doing. I’ll need to ensure I keep to this, as I still find myself returning to comfortingly simple video courses where it is quite easy to go into auto-pilot mode if you don’t do the practice problems and code alongside them.
Of course this is absolutely not the way the writers/teachers of the courses intended for their students to learn on the course. They have practice problems and applications. It’s just I find the walkthroughs often provided make it too easy for the weak-willed such as myself to think it all makes sense. This is not a criticism of the courses because of course you’re meant to take the knowledge learnt there and apply it.
Therefore I definitely don’t think the video courses are a waste of time, but if this is the only way you’re learning data science, I don’t think you’ll find yourself learning much, or at least remembering much.
Find your own problems
Another benefit of sticking to problems is that you will learn what is relevant to you. I have certainly found myself veering towards the programming side of things, as I have enjoyed making ‘cool’ things happen from the command line, but I am also aware that some of this may not be useful to me in my career.
I certainly don’t regret this, and I am lucky enough to have the luxury of time and can indulge my desire to learn new and random things while I’m travelling. However, I am seeing the need to add more structure to my learning, as I find myself being pulled in by the idea of learning Django or other random things which likely will not be useful to me in my career as a data and optimisation analyst/aspiring data scientist.
Aims going forward
- Create a WhatsApp chat analyser - this is a fun and seemingly simple project, one I’ve seen a few times on Reddit’s r/dataisbeautiful, that incorporates dataframes, regex, functions and plots
- Complete my first full competition on Kaggle, Analytics Vidhya and DrivenData
- Create a learning strategy to keep me focused on what’s relevant
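As a rough starting point for the WhatsApp chat analyser, here is a minimal sketch of the regex and function side of it, using only the standard library. The export line format (`12/03/2019, 21:15 - Alice: Hello there`) is an assumption, as it varies by phone and locale, and the names `LINE_RE`, `parse_chat` and `messages_per_sender` are just illustrative.

```python
import re
from collections import Counter

# ASSUMPTION: each message starts with "DD/MM/YYYY, HH:MM - Sender: text".
# Real exports differ by device and locale, so this pattern may need adjusting.
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{2,4}), (?P<time>\d{1,2}:\d{2}) - "
    r"(?P<sender>[^:]+): (?P<text>.*)$"
)


def parse_chat(lines):
    """Turn raw export lines into a list of message dicts."""
    messages = []
    for line in lines:
        line = line.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            messages.append(match.groupdict())
        elif messages:
            # No "date, time - Sender:" prefix: treat the line as a
            # continuation of the previous message. Note that system
            # notices without a sender also land here in this sketch.
            messages[-1]["text"] += "\n" + line
    return messages


def messages_per_sender(messages):
    """Count how many messages each sender wrote."""
    return Counter(m["sender"] for m in messages)


# Hypothetical sample data, not taken from a real chat.
sample = [
    "12/03/2019, 21:15 - Alice: Hello there",
    "12/03/2019, 21:16 - Bob: Hi!",
    "How was the trip?",
    "12/03/2019, 21:20 - Alice: Great, thanks",
]

messages = parse_chat(sample)
counts = messages_per_sender(messages)
print(counts.most_common())  # [('Alice', 2), ('Bob', 1)]
```

Since `parse_chat` returns a list of dicts, it should be possible to pass the result straight to `pandas.DataFrame(messages)` for the dataframe and plotting parts.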