This article was first published on python – paulvanderlaken.com, and kindly contributed to python-bloggers.
Want to share your content on python-bloggers? click here.
Amit Ness gathered an impressive list of learning resources for becoming a data scientist.
It’s great to see that he shares them publicly on his GitHub so that others may follow along.
But beware, this learning guideline covers a multi-year process.
Amit’s personal motto seems to be “Becoming better at data science every day”.
Completing the hyperlinked list below will take you several hundred days at the very least!
Learning Philosophy:
- The Power of Tiny Gains
- Master Adjacent Disciplines
- T-shaped skills
- Data Scientists Should Be More End-to-End
- Just in Time Learning
Index
- Have basic business understanding
- Be able to frame an ML problem
- Be familiar with data ethics
- Be able to import data from multiple sources
- Be able to annotate data efficiently
- Be able to manipulate data with NumPy
- Be able to manipulate data with Pandas
- Be able to manipulate data in spreadsheets
- Be able to manipulate data in databases
- Be able to use the command line
- Be able to perform feature engineering
- Be able to experiment in a notebook
- Be able to visualize data
- Be able to read research papers
- Be able to model problems mathematically
- Be able to structure machine learning projects
- Be able to utilize version control
- Be able to use data version control
- Be familiar with fundamental ML algorithms
- Be familiar with fundamentals of deep learning
- Be able to implement models in scikit-learn
- Be able to implement models in TensorFlow and Keras
- Be able to implement models in PyTorch
- Be able to implement models using cloud services
- Be able to apply unsupervised learning algorithms
- Be able to implement NLP models
- Be familiar with Recommendation Systems
- Be able to implement computer vision models
- Be able to model graphs and network data
- Be able to implement models for time series and forecasting
- Be familiar with Reinforcement Learning
- Be able to optimize performance metrics
- Be familiar with literature on model interpretability
- Be able to optimize models for production
- Be able to write unit tests
- Be able to serve models as REST APIs
- Be able to build interactive UIs for models
- Be able to deploy models to production
- Be able to perform load testing
- Be able to perform A/B testing
- Be proficient in Python
- Be familiar with compiled languages
- Have a general understanding of other parts of the stack
- Be familiar with fundamental Computer Science concepts
- Be able to apply proper software engineering process
- Be able to efficiently use a text editor
- Be able to communicate and collaborate well
- Be familiar with the hiring pipeline
- Broaden Perspective
To leave a comment for the author, please follow the link and comment on their blog: python – paulvanderlaken.com.