We all know about Machine Learning. If you're reading this, I believe you've already seen my previous blog, where I explained what Machine Learning is in a very basic and easy way. If not, I recommend checking it out first — it will give you a solid starting point and help you understand what we’re about to learn here.
In that blog, I also gave a complete roadmap on how to start learning ML. So before continuing here, take a quick look at it to get a clear picture of where you're heading.
Now, let’s get into the first part of this learning series. In this blog, we’re going to explore how Math is used in Machine Learning. We’ll understand why Math is important, what types of Math are involved, and how it all connects to real ML models.
I’ll explain everything — even the toughest concepts — in a simple and beginner-friendly way. You don’t need to stress. Just spend the next 5 minutes reading, and I promise this will be super useful for your future.
Linear Algebra in Machine Learning
In this section, we are going to learn about Linear Algebra and how it is used in Machine Learning. Let's begin with some basic concepts:
Vectors: These are nothing but the features in a dataset. For example, if you have a dataset from a college, the features might include Student ID and admission type (Management or Counselling). These pieces of information are called features, and in linear algebra, each data point's features are treated as a vector.
Matrix: When you bring all those features together across many students or staff, the entire dataset is represented as a matrix. Just imagine: a matrix = full table of data.
Tensor: A tensor is an extension of vectors and matrices to higher dimensions. For example, a color image is a 3D tensor (height × width × color channels), and a video adds a time dimension on top of that. Audio data can be represented as tensors too. Tensors are heavily used in deep learning.
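If you like seeing things in code, here is a minimal NumPy sketch (with made-up numbers) showing that a vector, a matrix, and a tensor differ only in their number of dimensions:

```python
import numpy as np

# A vector: one student's features (e.g., marks in 3 subjects)
student = np.array([78, 85, 92])

# A matrix: the whole table of data, one row per student
dataset = np.array([
    [78, 85, 92],
    [64, 70, 88],
])

# A tensor: e.g., a tiny 2x2 color image (height x width x RGB channels)
image = np.zeros((2, 2, 3))

print(student.shape)  # (3,)      -> 1-D: a vector
print(dataset.shape)  # (2, 3)    -> 2-D: a matrix
print(image.shape)    # (2, 2, 3) -> 3-D: a tensor
```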
Addition: Used for combining features or predictions. For example, you might combine results from different models to make one final prediction.
Multiplication: This is used when you pass data through layers in a neural network. It multiplies weights with input features. Multiplication is one of the core operations in Deep Learning.
Transpose: This swaps a matrix's rows and columns. It shows up during the Gradient Descent process, when we update weights and the matrix shapes need to line up. We'll cover Gradient Descent in detail later.
Dot Product: It's a type of multiplication that multiplies matching elements of two vectors and adds up the results, giving one number that measures the similarity between the two vectors. This is commonly used in Neural Networks.
Cross Product: Used in Robotics and 3D animation. It takes two 3D vectors and produces a third vector perpendicular to both, which helps calculate directions and rotations in 3D space.
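To make these operations concrete, here is a small NumPy sketch with made-up numbers; each snippet maps to one of the operations above:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Addition: combine two prediction vectors (e.g., averaging two models)
combined = (a + b) / 2

# Dot product: one number measuring how similar a and b are
similarity = np.dot(a, b)  # 1*4 + 2*5 + 3*6 = 32.0

# Cross product: a 3-D vector perpendicular to both a and b
perpendicular = np.cross(a, b)

# Matrix multiplication: inputs times weights, like one neural-network layer
X = np.array([[1.0, 2.0], [3.0, 4.0]])  # 2 samples, 2 features
W = np.array([[0.1, 0.2], [0.3, 0.4]])  # weights
output = X @ W

# Transpose: swap rows and columns
print(X.T)
```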
Identity Matrix: This is a special type of matrix that acts like the number “1” in multiplication. If you multiply any matrix by the identity matrix, the output is the original matrix itself. It is used to preserve the values during certain ML calculations.
Inverse Matrix: Inverse matrices are used in solving equations — especially in regression problems. For example, in linear regression, we use inverse matrices to calculate the best-fit line.
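As a concrete (simplified) example, the classic normal equation for linear regression computes the best-fit line in one shot using an inverse matrix. Here is a tiny sketch with made-up data:

```python
import numpy as np

# Made-up data: y is roughly 2*x + 1
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])   # the column of 1s adds the intercept
y = np.array([3.1, 4.9, 7.2])

# Normal equation: w = (X^T X)^(-1) X^T y
w = np.linalg.inv(X.T @ X) @ X.T @ y
print(w)  # roughly [0.97, 2.05]: intercept ~1, slope ~2
```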
Eigenvectors and Eigenvalues: These are used for dimensionality reduction. You’ll learn about PCA (Principal Component Analysis) later, which is based on this.
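If you're curious what this looks like in code, here is a minimal NumPy sketch on a small made-up matrix:

```python
import numpy as np

# A covariance-style matrix describing how two features vary together
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

values, vectors = np.linalg.eig(A)
print(values)   # eigenvalues: how much variance lies along each direction
print(vectors)  # eigenvectors (the columns): the directions themselves
# PCA keeps the directions with the largest eigenvalues
# and drops the rest: that's dimensionality reduction.
```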
Matrix Factorization: This breaks one big matrix into smaller ones. It is best known for powering recommendation systems (like suggesting movies), and related factorizations are also used to compress the large weight matrices inside deep learning models. This is an advanced concept but plays an important role in modern ML.
Probability and Statistics in Machine Learning
In this part, we’re going to learn about Probability and Statistics — two important concepts in Machine Learning.
Probability Theory: In machine learning, we often deal with data that is not 100% certain. So, probability helps us make decisions when we are not fully sure — like guessing the next word, predicting whether it will rain, or whether an email is spam or not.
Descriptive Statistics: Descriptive statistics helps us understand and summarize the data. It tells us what the data looks like before we even start building models. Some basic tools used here are Mean, Median, Mode, Range, Variance, and Standard Deviation. These are all used to analyze data before training models.
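Here is a quick sketch of all of these on a made-up list of student marks:

```python
import numpy as np
from statistics import mode

marks = np.array([60, 70, 70, 80, 95])

print(np.mean(marks))             # mean: 75.0
print(np.median(marks))           # median: 70.0
print(mode(marks))                # mode: 70 (appears most often)
print(marks.max() - marks.min())  # range: 35
print(np.var(marks))              # variance: 140.0, spread around the mean
print(np.std(marks))              # standard deviation: ~11.83, its square root
```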
Bayes’ Theorem: Bayes’ Theorem is used to update our beliefs based on new evidence. For example, let’s say a model has a certain belief based on old data. When new data comes in, Bayes' Theorem helps update that belief. This is commonly used in email spam detection, disease prediction, etc.
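Here is a minimal sketch of the spam example with made-up probabilities:

```python
# Bayes' Theorem: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam = 0.2             # 20% of all mail is spam (the "old belief")
p_word_given_spam = 0.6  # 60% of spam contains the word "free"
p_word_given_ham = 0.05  # 5% of normal mail contains "free"

# Total probability of seeing the word "free" in any email
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Updated belief after seeing the evidence
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(p_spam_given_word)  # 0.75: the word "free" raised our spam belief from 20% to 75%
```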
Gaussian Distribution (Normal Distribution): This is one of the most common data patterns in the world. It looks like a bell-shaped curve — most values fall near the center, and fewer values fall near the edges. Many ML models (especially regression) assume that data follows this shape.
Standard Deviation & Variance: These tell us how much the data is spread out. A low standard deviation means the data points are close to the average. A high standard deviation means the data points are spread out. This is useful when you're checking for outliers or noisy data.
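Here is a small sketch that shows both of the last two ideas at once, using made-up parameters: samples from a Gaussian cluster near the mean, and the standard deviation measures their spread:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw 10,000 samples from a Gaussian with mean 50 and std 10
data = rng.normal(loc=50, scale=10, size=10_000)

print(np.mean(data))  # close to 50: most values cluster at the center
print(np.std(data))   # close to 10

# In a bell curve, roughly 68% of values fall within 1 std of the mean
within_one_std = np.mean(np.abs(data - 50) < 10)
print(within_one_std)  # roughly 0.68
```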
Correlation & Covariance: These help us understand the relationship between two variables. Covariance tells us the direction of the relationship (positive or negative), but its size depends on the units of the data. Correlation puts that on a fixed scale from -1 to +1, so it shows both the direction and how strongly two things are related (like height and weight).
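In code, NumPy gives us both in one line each (the heights and weights below are made up):

```python
import numpy as np

height = np.array([150, 160, 170, 180, 190])
weight = np.array([50, 58, 65, 74, 82])

print(np.cov(height, weight)[0, 1])       # covariance: positive, so they rise together
print(np.corrcoef(height, weight)[0, 1])  # correlation: close to 1, a very strong link
```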
Maximum Likelihood Estimation (MLE): MLE is used to find the parameter values that make the observed data most likely. For example, when training a model, we try to find the weights or parameters under which our data would be most probable; that is exactly what MLE does.
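A tiny sketch of the idea, assuming the data comes from a Gaussian with an unknown mean (in that case the math says the likelihood is maximized at the sample mean):

```python
import numpy as np

# Made-up measurements we believe come from a Gaussian with unknown mean mu
data = np.array([4.8, 5.1, 5.0, 4.9, 5.2])

# MLE asks: which mu makes this data most likely?
# For a Gaussian, the answer works out to the sample mean.
mu_mle = np.mean(data)
print(mu_mle)  # 5.0, the most likely value of the unknown mean
```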
Kullback–Leibler Divergence (KL Divergence): This is a bit more advanced. KL Divergence measures how different one probability distribution is from another. It is used in deep learning and probabilistic models when we want to compare actual vs. predicted distributions.
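For the curious, here is a minimal sketch of the KL formula on two made-up distributions:

```python
import numpy as np

# Two probability distributions over the same 3 outcomes
actual = np.array([0.7, 0.2, 0.1])
predicted = np.array([0.5, 0.3, 0.2])

# KL(actual || predicted) = sum( p * log(p / q) )
kl = np.sum(actual * np.log(actual / predicted))
print(kl)  # a small positive number; 0 would mean a perfect match
```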
Calculus in Machine Learning
Now let’s talk about Calculus, another important part of Machine Learning — especially in training models like Neural Networks.
Limits: Limits help us understand how a function behaves as it gets closer to a certain point. In ML, limits are used in concepts like gradients where we analyze small changes in data to optimize our model.
Derivatives: A derivative tells us how fast something is changing. In Machine Learning, it is used to adjust the model’s parameters based on how the output is changing.
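Both of the last two ideas fit in a couple of lines: we approximate a derivative by taking the slope over a tiny step h, which is exactly the limit idea in action. A minimal sketch:

```python
# The derivative of f at x is the limit of (f(x + h) - f(x)) / h as h shrinks.
def f(x):
    return x ** 2  # a simple loss-like function

x, h = 3.0, 1e-6
slope = (f(x + h) - f(x)) / h
print(slope)  # roughly 6.0; the exact derivative of x^2 at x=3 is 2x = 6
```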
Chain Rule: The chain rule helps us take the derivative of functions inside other functions. This is super important in Deep Learning, especially when we’re updating weights in different layers of a neural network. It helps the model learn from one layer to the next.
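A tiny worked example of the chain rule, computed by hand in code:

```python
# Chain rule: if y = f(g(x)), then dy/dx = f'(g(x)) * g'(x).
# Example: y = (3x + 1)^2, so g(x) = 3x + 1 (inner) and f(u) = u^2 (outer).
x = 2.0
g = 3 * x + 1          # inner function: g(2) = 7
dy_dg = 2 * g          # outer derivative: f'(g) = 2g = 14
dg_dx = 3              # inner derivative: g'(x) = 3
dy_dx = dy_dg * dg_dx  # chain rule: 14 * 3 = 42
print(dy_dx)
# Backpropagation applies exactly this, layer by layer, through a network.
```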
Gradient Descent: Gradient Descent is the heart of training a machine learning model. It is a method where we reduce the loss (error) step by step by updating the weights of the model using derivatives: at each step, the derivative tells us which direction makes the error grow, and we move the weights the opposite way.
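Here is what that loop looks like for a deliberately simple loss function (a made-up toy example, not a real model):

```python
# Minimise a simple loss: L(w) = (w - 4)^2. Its derivative is 2 * (w - 4).
w = 0.0            # start from a bad guess
learning_rate = 0.1

for step in range(50):
    gradient = 2 * (w - 4)            # which way does the loss increase?
    w = w - learning_rate * gradient  # step the opposite way

print(w)  # very close to 4.0, the value that makes the loss smallest
```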
Backpropagation: Backpropagation is the process of updating weights in a neural network. It uses derivatives and chain rule to calculate how much each weight should change to reduce the error. This is how models learn from their mistakes.
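Putting the last three ideas together, here is a minimal sketch of a "network" with a single weight, trained on one made-up example; the backward pass is just the chain rule, and the update is gradient descent:

```python
# One tiny "network": prediction = w * x, loss = (prediction - target)^2
x, target = 2.0, 10.0  # made-up training example (the true w would be 5)
w = 1.0                # initial weight

for step in range(100):
    prediction = w * x                    # forward pass
    loss = (prediction - target) ** 2
    # Backward pass (chain rule): dloss/dw = dloss/dpred * dpred/dw
    dloss_dpred = 2 * (prediction - target)
    dpred_dw = x
    dloss_dw = dloss_dpred * dpred_dw
    w = w - 0.01 * dloss_dw               # gradient descent update

print(w)  # close to 5.0: the network "learned" the right weight
```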
Final Takeaway
Yeah, this blog might have made you feel a bit sleepy or even bored, because it's fully about math. For many of us, math is one of the most irritating subjects. But the sweetness of math comes only when you see its real beauty.

We use math every day without even realizing it. While we sleep, eat, or even watch movies, math is working silently in the background. We just don't notice it, but it's there, because the entire world is built on math. You could even say: "The world itself is a matrix."

So instead of running away from it, let's welcome math with an open mind. You don't need to learn every formula or technical definition. Just understand the concept: why it's used, and how it connects with Machine Learning.
In this blog, I've shared many subtopics. Don't try to remember everything now; just understand what each topic means. That's enough for now. You might've noticed I mentioned regression models earlier. If you've read my previous blog, you can now easily connect how math is used in regression models. Try this: go to YouTube or Google and search "How does Linear Regression use math?" or "How does Gradient Descent use calculus?" Watch a simple explanation and compare it with what you've learned here. Trust me, it'll all start making sense.
So yeah — this blog may have felt a bit heavy, but I’m damn sure the next ones won’t be boring! Because from here, we’ve finished the math part. 🎉 We are going to dive deep into real Machine Learning models now. You’ll enjoy that part a lot — I promise! So keep your notifications ON, and make sure to follow my page. I’ll be sharing more exciting content in upcoming blogs.
See you in the next one! 🚀