General Concepts

Data Science

Data science is the field of study that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from data.

Machine Learning

Machine learning is the science of getting computers to perform a task without being explicitly programmed.

Data Sets

Training Set

The training set is the set of examples used to fit the parameters of the model. For example, if you are training a model to recognize images of different types of fruit, the training set would be the images of fruit that the model learns from.

Validation Set

The validation set provides an evaluation of a model fit on the training set while the model's hyperparameters are being tuned. Because it guides those tuning choices, the evaluation grows more optimistic the more it is reused.

Test Set

The test set is a dataset used to provide an unbiased evaluation of a final model fit on the training set. It is common practice to evaluate on this set only after the model's parameters and hyperparameters are completely tuned.
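The three sets above are usually carved out of one pool of examples. The following is a minimal sketch of such a split using only the Python standard library. The function name split_dataset, the 60/20/20 proportions, and the fixed seed are assumptions chosen for illustration, not a prescribed recipe.

```python
import random


def split_dataset(examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle examples and split them into training, validation, and test lists."""
    shuffled = list(examples)              # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a reproducible split

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    test_set = shuffled[:n_test]
    validation_set = shuffled[n_test:n_test + n_val]
    training_set = shuffled[n_test + n_val:]  # everything left over
    return training_set, validation_set, test_set


# Hypothetical data: 100 placeholder examples identified by integers.
examples = list(range(100))
training_set, validation_set, test_set = split_dataset(examples)
print(len(training_set), len(validation_set), len(test_set))
```

Shuffling before slicing keeps each set representative when the original data is ordered, and no example appears in more than one set.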

Supervised vs Unsupervised and Their Models

Supervised Learning

In supervised learning, a model is trained on labeled examples so that it can predict the label of new data from its features.

Classification Model

Classification models try to predict categorical labels (e.g. yes/no, species, type of fruit).
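As a rough illustration of predicting a categorical label, here is a minimal sketch of a 1-nearest-neighbor classifier in plain Python. The fruit measurements, the two features (weight and diameter), and the function name predict_label are all hypothetical, and nearest neighbor is only one of many possible classification methods.

```python
import math

# Hypothetical labeled examples: (weight in grams, diameter in cm) -> fruit type.
training_set = [
    ((150.0, 7.0), "apple"),
    ((170.0, 7.5), "apple"),
    ((120.0, 6.0), "lemon"),
    ((110.0, 5.5), "lemon"),
]


def predict_label(features, training_set):
    """Return the label of the training example closest to the given features."""
    nearest = min(training_set, key=lambda example: math.dist(example[0], features))
    return nearest[1]


print(predict_label((160.0, 7.2), training_set))  # closest examples are apples
print(predict_label((115.0, 5.8), training_set))  # closest examples are lemons
```

The output is always one of the labels seen in training, which is what makes this a classification task.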

Regression Model

Regression models try to predict a numerical label (e.g. a number, a vector, bounding box coordinates).
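As a rough illustration of predicting a numerical label, here is a minimal sketch of fitting a straight line by ordinary least squares in plain Python. The data points and the names fit_line and predict are made up for the example, and a single-feature linear model is only the simplest kind of regression model.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict a numerical label for the feature value x."""
    return slope * x + intercept


# Hypothetical training data that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
print(predict(5.0, slope, intercept))
```

Unlike the classification case, the prediction can be any number, including values that never appeared among the training labels.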