We will look at spam filtering with a real data set in which every email carries a “label”: spam or not spam. We will use a logistic regression classifier to solve this assignment and submit the outputs to a Kaggle competition. The assignment goes from data loading to data inspection, data pre-processing, and creating a train/test split, and finally to training the model, making predictions, and evaluating them. These steps typically form one part of the “full pipeline” in ML modeling/prototyping.
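To make the sequence of steps concrete, here is a minimal, dependency-free sketch of the same pipeline. Everything in it is an assumption for illustration: the toy `EMAILS` list stands in for the real data set, and the helper names (`tokenize`, `split_rows`, `featurize`, `fit`, `accuracy`) are hypothetical. It uses bag-of-words counts and plain stochastic gradient descent; an assignment like this would more typically rely on a library implementation of logistic regression, and the Kaggle submission step is not shown.

```python
import math
import random
from collections import Counter

# 1. Data loading: a made-up toy data set (1 = spam, 0 = not spam).
EMAILS = [
    ("win free money now", 1), ("free prize claim now", 1),
    ("cheap money offer free", 1), ("claim your free offer", 1),
    ("win a cheap prize", 1), ("money prize win now", 1),
    ("meeting schedule for monday", 0), ("project report attached", 0),
    ("lunch on monday", 0), ("please review the report", 0),
    ("schedule the project meeting", 0), ("notes from the meeting attached", 0),
]

# 2. Data inspection: check the class balance.
print("label counts:", Counter(label for _, label in EMAILS))

# 3. Pre-processing: lowercase and split on whitespace.
def tokenize(text):
    return text.lower().split()

# 4. Train/test split: shuffle with a fixed seed, hold out a fraction.
def split_rows(rows, test_fraction=0.25, seed=0):
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]

def build_vocab(rows):
    words = sorted({w for text, _ in rows for w in tokenize(text)})
    return {w: i for i, w in enumerate(words)}

def featurize(text, vocab):
    # Sparse bag-of-words: {feature index: count}; unseen words are dropped.
    counts = Counter(tokenize(text))
    return {vocab[w]: c for w, c in counts.items() if w in vocab}

def predict_proba(x, weights, bias):
    z = bias + sum(weights[i] * v for i, v in x.items())
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid

# 5. Learning: logistic regression fitted with stochastic gradient descent.
def fit(rows, vocab, epochs=200, lr=0.5):
    weights, bias = [0.0] * len(vocab), 0.0
    data = [(featurize(text, vocab), label) for text, label in rows]
    for _ in range(epochs):
        for x, y in data:
            err = predict_proba(x, weights, bias) - y  # gradient of log loss w.r.t. z
            for i, v in x.items():
                weights[i] -= lr * err * v
            bias -= lr * err
    return weights, bias

# 6. Prediction and evaluation: threshold at 0.5 and measure accuracy.
def accuracy(rows, vocab, weights, bias):
    hits = sum(
        (predict_proba(featurize(text, vocab), weights, bias) >= 0.5) == bool(label)
        for text, label in rows
    )
    return hits / len(rows)

train_rows, test_rows = split_rows(EMAILS)
vocab = build_vocab(train_rows)  # vocabulary comes from training data only
weights, bias = fit(train_rows, vocab)
train_acc = accuracy(train_rows, vocab, weights, bias)
test_acc = accuracy(test_rows, vocab, weights, bias)
print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
```

The vocabulary is built from the training rows only, so the held-out emails are scored the way unseen competition emails would be. With a test set this small, the reported test accuracy is only a sanity check, not a reliable estimate.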