How would you roll your own data science course?
Last week I mentioned I was taking an online data science course. This week I made the difficult decision to drop it and look for a solution better suited to me, my learning habits, and what I’m looking to do. Based on the course syllabus and the work I’ve done so far, I’ve put together this list of topics I’d like to improve upon, along with the resources I’ve found to help with each. What should I add?
Databases
- Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement: I’m hoping this book will provide a solid survey of the database technologies available, and help me find the best next steps for each as needed.
- SQL refresher: One thing I realized in the course is that I’ve sometimes gotten a little too reliant on a good object-relational mapper. What’s a good resource for getting my hands dirty again with plain old SQL? What about O’Reilly’s Learning SQL or Zed’s Learn SQL the Hard Way?
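For a sense of what “plain old SQL” looks like without an ORM in the way, here’s a minimal sketch using Python’s built-in sqlite3 module. The students table, its columns, and the sample rows are made up for illustration; they aren’t from the course.

```python
import sqlite3

# In-memory database, so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented purely for this example.
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)
conn.executemany(
    "INSERT INTO students (name, course) VALUES (?, ?)",
    [("Ada", "stats"), ("Grace", "databases"), ("Linus", "stats")],
)

# The kind of query an ORM would usually write for me:
# count students per course, largest group first.
rows = conn.execute(
    "SELECT course, COUNT(*) AS n "
    "FROM students GROUP BY course ORDER BY n DESC, course"
).fetchall()

print(rows)
conn.close()
```

Writing the GROUP BY and ORDER BY by hand is exactly the muscle I’d like to rebuild.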
Python
I knew enough Python prior to starting the class to get through the homework. I did take advantage of the Python track on Codecademy to refresh my memory just enough. Is Python the scripting language for data science, or just the language of choice for the teaching staff? Whither Ruby? Or, if you wanted to get from beginner Pythonista to advanced beginner, what would you read next?
R
- I read Exploring Everyday Things with R and Ruby last year; may need to revisit it.
- There’s An Introduction to R, available for free online, which I haven’t yet read.
- Code School has a free Try R course I also need to review.
Math and statistics
- I’ll admit it. I haven’t taken a math class in 20 years. I had to revisit matrices on Khan Academy. Are there other resources for old dudes who suddenly find themselves having to do algebra (or more) again?
- I’ve got a copy of Think Stats ready to read at some point.
What else?
I’m sure this list is missing things; what should I add?