How much Python is required for data science? (Reddit)

I'm a computational biologist, not a data scientist, but... OP's complaint seems to be just what science is. Most of your time isn't spent writing beautiful, novel code. It's thinking, banging out the code, then spending a ton of time tweaking figures and presentations (then in my case, writing. so much writing).

The crunch to "get shit done" can be super real. This is actually particularly bad in biology, where "well, it runs on my machine" is basically the standard for code in published papers (an exception being a paper whose purpose is a new tool). That's if they even provide the code to begin with.

OP, I was in your shoes 6 months ago, so I know the pain and anxiety.

Many of the answers here are correct. If you are looking for an entry-level data analyst job, no one is asking you for Python, although having Python in your toolset would be an added bonus.

Concentrate on SQL and advanced Excel. Learn a cloud platform like GCP or AWS and one of its database services. Get data from Kaggle or even the CDC. Do some projects where you can show your SQL knowledge. For beginners, I would recommend being at least comfortable with joins and unions. If you can do CTEs and window functions, you will be a godsend to the team. Put your projects on LinkedIn, so that recruiters may approach you.
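To make "joins, CTEs and window functions" a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module, so it runs without any database setup. The tables, column names and numbers are all made up for illustration, and window functions assume the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

# Hypothetical toy data: table and column names are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana', 'East'), (2, 'Bo', 'East'), (3, 'Cy', 'West');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 70.0), (3, 2, 30.0), (4, 3, 90.0);
""")

query = """
    WITH totals AS (                               -- CTE: total spend per customer
        SELECT c.name, c.region, SUM(o.amount) AS total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id   -- inner join on the customer key
        GROUP BY c.id, c.name, c.region
    )
    SELECT name, region, total,
           -- window function: rank customers by spend within their own region
           RANK() OVER (PARTITION BY region ORDER BY total DESC) AS rank_in_region
    FROM totals
    ORDER BY region, rank_in_region
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same query text should carry over to most cloud SQL engines with little or no change, which is part of why SQL is such a safe first investment.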

Last but not least, you didn't mention what field you are in. Data analysis is a huge field, ranging from medicine, insurance and finance to logistics, marketing and machine learning. In my experience, many firms are willing to overlook your lack of data analysis knowledge if you have a strong foundation in their field. So looking for jobs in your field of specialization is a sure bet for landing an entry-level position.

Machine Learning is, IMHO, more of a branch of statistics (statistical learning) rather than programming. The bare theory can be approached without any idea of programming. Some quick examples:

  • A Neuron n : R^n -> R is a non-linear function that can be written as σ(g(x)) where σ : R -> R is a non-linear function and g : R^n -> R is a linear function.

  • A Feed-Forward Layer L : R^n -> R^m is a function such that the i-th component of its output can be written as L_i = n_i(x), where n_i : R^n -> R is a Neuron.

  • A Neural Network N : R^n -> R^m is a function that uses one or more Neurons in its implementation.

  • A Feed-Forward Neural Network F : R^n -> R^m is a function that can be written as F(x) = L(f(x)) where L : R^j -> R^m is a Feed-Forward Layer and f : R^n -> R^j is either a Feed-Forward Neural Network or the function I(x) = x.

  • The Universal Approximation Theorem states (roughly, and for suitable choices of σ) that for any continuous function f : K -> R^m defined on a compact (closed and bounded) subset K of R^n and any threshold ε>0 there exists a FFNN F : R^n -> R^m such that for any x in K, ||F(x) - f(x)|| < ε. In other words, any continuous function can be approximated arbitrarily well on a bounded region by a sufficiently large FFNN.
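For anyone who does code, the definitions above map almost one-to-one onto a few lines of Python. This is only a sketch: the sigmoid is one possible choice of σ, the helper names (make_neuron, make_layer, make_ffnn) are made up here, the weights are arbitrary and untrained, and g includes a bias term (so strictly speaking it is affine).

```python
import math

def sigmoid(z):
    """One common choice for the non-linear function sigma : R -> R."""
    return 1.0 / (1.0 + math.exp(-z))

def make_neuron(weights, bias, sigma=sigmoid):
    """Neuron n : R^n -> R, written as sigma(g(x)) with g(x) = w.x + b."""
    def neuron(x):
        g = sum(w * xi for w, xi in zip(weights, x)) + bias
        return sigma(g)
    return neuron

def make_layer(neurons):
    """Feed-Forward Layer L : R^n -> R^m, where L_i(x) = n_i(x)."""
    def layer(x):
        return [n(x) for n in neurons]
    return layer

def make_ffnn(layers):
    """FFNN F(x) = L(f(x)); with no layers it is the identity I(x) = x."""
    def network(x):
        for layer in layers:
            x = layer(x)
        return x
    return network

# Arbitrary (untrained) weights: a network R^2 -> R^2 -> R^1.
hidden = make_layer([make_neuron([1.0, -1.0], 0.0), make_neuron([0.5, 0.5], -1.0)])
output = make_layer([make_neuron([2.0, -2.0], 0.0)])
net = make_ffnn([hidden, output])

y = net([0.0, 0.0])
print(y)
```

Note that nothing here "learns" anything; it only evaluates the functions as defined.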

As you can see, there's no computational theory in this part, and these are all basic definitions in the Neural Network side of Machine Learning. Even training algorithms are, at their core, mathematical definitions of finding minimum values of a certain function and so on.
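To illustrate the "finding minimum values" part, here is a minimal sketch of plain gradient descent on a one-variable function. The function f(x) = (x - 3)^2, the learning rate and the step count are all arbitrary choices for the example; real training minimizes a loss over many parameters, but the idea is the same.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# Example: f(x) = (x - 3)^2 has derivative f'(x) = 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```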

However, if you want to make some ML software, you will need basic programming knowledge:

  • Flow control (if..else, while, for). This is necessary to get started.

  • Object-oriented programming (classes). This is useful to understand how certain libraries work and to manage data better, but not mandatory to write your first program.

  • Data scraping. This is useful to create your own datasets, but not mandatory to learn ML algorithms since you can use already existing datasets to practice.

  • Deploying a model. This lets you share what you create with the world without making people install all the software you used: they just get, for example, a webpage that tells them whether an uploaded image is a cat or a dog. It becomes useful later on, once you are actually making useful stuff.
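To give a sense of how little programming the first two bullets really require, here is a rough sketch of a 1-nearest-neighbour classifier that uses nothing but a class, a for loop and an if. The class name, the features (weight and ear length) and the data points are all invented for the example; a real project would more likely use a library.

```python
class NearestNeighbour:
    """1-nearest-neighbour classifier: predicts the label of the closest training point."""

    def fit(self, points, labels):
        # "Training" here is just remembering the data.
        self.points = points
        self.labels = labels
        return self

    def predict(self, x):
        best_label, best_dist = None, float("inf")
        for point, label in zip(self.points, self.labels):
            # Squared Euclidean distance between the stored point and the query.
            dist = sum((a - b) ** 2 for a, b in zip(point, x))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label

# Made-up data: (weight in kg, ear length in cm) for a few cats and dogs.
points = [(4.0, 6.0), (5.0, 7.0), (20.0, 12.0), (30.0, 14.0)]
labels = ["cat", "cat", "dog", "dog"]

model = NearestNeighbour().fit(points, labels)
prediction = model.predict((6.0, 6.5))
print(prediction)
```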

The good news is that if you understood what I was talking about in the NN part from the math side of things, you are probably well prepared to delve into the math of ML for a bottom-up approach to ML.

The other good news is that if you have basic flow control skills (which I'm more than sure you do), you can start a top-down approach.

If you want a bottom-up approach (notoriously more boring, but more formal and "complete", also more apt for math-oriented people like me), I usually suggest starting with Andrew Ng's course on Coursera. It explains the basics of ML introducing math as it goes, even if you aren't very well-versed in it. It uses Octave for the programming side of things, which is a bit unconventional, but this just goes to show that you don't need special knowledge to start.

If you want a top-down approach, I (strongly) recommend fast.ai's Deep Learning for Coders. It assumes knowledge of flow control and (IIRC) OOP, but it throws you in a very approachable way into Deep Learning, which is one of the "coolest" sides of ML and usually what interests people. I tend to prefer classical ML over Deep Learning, since it's also more common in my line of work, but DL is a must-know to follow improvements in ML and AI. The downside of DL for Coders is that it teaches fast.ai's own library, an unconventional deep learning framework which, although really good IMHO, has a bit less support than others. Either way, the knowledge you'll get is, with a bit of pain, transferable to other frameworks, and the course is DEFINITELY more beginner-friendly.

If you want something more in the middle, I've recently gone through Hands-On Machine Learning with Scikit-Learn and TensorFlow, which is an AWESOME book, although a bit outdated. It explains things with a good mixture of formalism and intuition and covers really useful stuff for real-world ML, if you are interested in that. It doesn't cover practical NNs until later on, but it spends a good chunk of time on fundamental ML, which is what you will use in your (eventual) ML job. There's also an updated version of the book, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition, which I bought but still haven't found the time to read.

TL;DR: if you want to get a job in ML, Hands-On Machine Learning is a must-have book and it's good for beginners, although it holds your hand a little less than other introductory courses. If you want to do cool stuff with ML (computer vision, artificial intelligence, etc.), you are probably interested in DL and I recommend fast.ai's Deep Learning for Coders. If you are a math-oriented person and you don't plan on working with ML in the near future, I recommend you start with Andrew Ng's course and then follow up with one of the other two based on what you think you will want to work with.