**Week 8: Assignment 8 - NPTEL Machine Learning Course**

This assignment covers critical topics from **Week 8 of the NPTEL course** on machine learning. It focuses on methods such as bagging, gradient boosting, random forests, graphical models, and Bayesian networks. The questions challenge students to apply their understanding of these concepts to practical scenarios, including regression, classification, and probabilistic models.

**Question 1:**

*In the bagging technique, the reduction of variance is maximum if:*

- (a) The correlation between the classifiers is minimum
- (b) Does not depend on the correlation between the classifiers
- (c) Similar features are used in all classifiers
- (d) The number of classifiers in the ensemble is minimized

**Answer:** (a) The correlation between the classifiers is minimum

**Reason:** Bagging (Bootstrap Aggregating) works by reducing the variance of the model. If the individual classifiers are weakly correlated, they will make different errors on the data, allowing the ensemble to average out the mistakes and reduce the overall variance.
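
The effect of correlation can be made concrete with a standard textbook result: if the ensemble averages `n_models` predictors that each have variance `sigma2` and share a common pairwise correlation `rho`, the variance of the average is `rho * sigma2 + (1 - rho) * sigma2 / n_models`. The sketch below only evaluates that formula; the equal-variance, equal-correlation setup and the function name `ensemble_variance` are simplifying assumptions, not part of the assignment.

```python
def ensemble_variance(sigma2, rho, n_models):
    """Variance of the mean of n_models predictors.

    Assumes every predictor has variance sigma2 and every pair of
    predictors has the same correlation rho (an idealized setup).
    """
    return rho * sigma2 + (1.0 - rho) * sigma2 / n_models


# Lower correlation leaves less irreducible variance after averaging.
for rho in (0.0, 0.5, 1.0):
    print(rho, ensemble_variance(sigma2=1.0, rho=rho, n_models=10))
```

With `rho = 1.0` averaging does not help at all, while with `rho = 0.0` the variance shrinks by the full factor of `n_models`, which is why option (a) gives the largest reduction.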

**Question 2:**

*If using squared error loss in gradient boosting for a regression problem, what does the gradient correspond to?*

- (a) The absolute error
- (b) The log likelihood
- (c) The residual error
- (d) The exponential loss

**Answer:** (c) The residual error

**Reason:** With squared error loss L = ½(y − F(x))², the negative gradient of the loss with respect to the current prediction F(x) is y − F(x), which is exactly the residual. Each boosting iteration therefore fits a new base learner to the residuals of the current ensemble.
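
As a quick numerical check of this relationship, the following sketch compares the residual with a finite-difference estimate of the negative gradient. The factor of ½ in the loss, the helper names, and the sample values are illustrative assumptions.

```python
def squared_error_loss(y, f):
    """Squared error loss with the conventional 1/2 factor."""
    return 0.5 * (y - f) ** 2


def negative_gradient(y, f, eps=1e-6):
    """Negative derivative of the loss w.r.t. the prediction f,
    estimated with a central finite difference."""
    grad = (squared_error_loss(y, f + eps) - squared_error_loss(y, f - eps)) / (2 * eps)
    return -grad


# Illustrative targets and current predictions (made-up values).
targets = [3.0, -1.0, 2.5]
predictions = [2.0, 0.5, 2.5]

for y, f in zip(targets, predictions):
    residual = y - f
    print(residual, negative_gradient(y, f))
```

Each printed pair should agree up to floating-point error, which is the sense in which the gradient "corresponds to" the residual.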

**Question 3:**

*In a random forest, if T (number of features considered at each split) is set equal to p (total number of features), how does this compare to standard bagging with decision trees?*

- (a) It is exactly the same as standard bagging
- (b) It will always perform better than standard bagging
- (c) It will always perform worse than standard bagging
- (d) Cannot be determined

**Answer:** (a) It is exactly the same as standard bagging

**Reason:** In a random forest, T is the number of features randomly selected at each split. If T is set equal to p, all features are considered at each split, making the method identical to standard bagging.
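
A minimal sketch of the per-split feature sampling step shows why the two methods coincide when T = p. The function `candidate_features` is a hypothetical helper written for illustration, not the API of any library.

```python
import random


def candidate_features(p, T, rng):
    """Return the indices of the T features (out of p) considered at one split."""
    return sorted(rng.sample(range(p), T))


rng = random.Random(0)
p = 5

# With T < p, each split sees a different random subset of features.
print(candidate_features(p, 2, rng))

# With T == p, every split sees all features, as in plain bagged trees.
print(candidate_features(p, p, rng))
```

When T equals p the sampling step has no effect, so the only remaining source of randomness is the bootstrap sample, which is exactly standard bagging.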

**Question 4:**

*Consider the following graphical model. Which of the following are true about the model? (Multiple options may be correct.)*

- (a) d is independent of b when c is known
- (b) a is independent of c when b is known
- (c) a is independent of d when b is known
- (d) a is independent of b when c is known

**Answer:** (a) d is independent of b when c is known, and (b) a is independent of c when b is known

**Reason:** The structure of the graphical model (Bayesian network) defines conditional dependencies. Understanding the relationships between nodes is critical for identifying which variables become independent when certain others are known.

**Question 5:**

*Consider the Bayesian network given in the previous question. Let “a”, “b”, “c”, “d”, and “e” denote the random variables shown in the network. Which of the following can be inferred from the network structure?*

- (a) “a” causes “d”
- (b) “c” causes “d”
- (c) Both (a) and (b) are correct
- (d) None of the above

**Answer:** (b) “c” causes “d”

**Reason:** From the Bayesian network structure, we can infer the causal relationships. The directed edges represent cause-effect relationships, and in this network, “c” directly influences “d.”

**Question 6:**

*A single box is randomly selected from a set of three. Two pens are then drawn from this container. These pens happen to be blue and green colored. What is the probability that the chosen box was Box A?*

| Box | Green | Blue | Yellow |
|---|---|---|---|
| A | 3 | 2 | 1 |
| B | 2 | 1 | 2 |
| C | 4 | 2 | 3 |

- (a) 37/18
- (b) 15/56
- (c) 18/37
- (d) 50/15

**Answer:** (c) 18/37

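**Reason:** Apply Bayes’ theorem with a uniform prior of 1/3 on each box. Assuming the two pens are drawn without replacement (the reading under which the listed answer holds), the probability of drawing one blue and one green pen is (3·2)/C(6,2) = 6/15 = 2/5 for Box A, (2·1)/C(5,2) = 2/10 = 1/5 for Box B, and (4·2)/C(9,2) = 8/36 = 2/9 for Box C. Because the priors are equal, they cancel, so P(A | blue, green) = (2/5) / (2/5 + 1/5 + 2/9) = (18/45) / (37/45) = 18/37.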
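
The calculation can be reproduced with exact fractions. This is a minimal sketch that assumes a uniform prior over the three boxes and drawing without replacement; the variable and function names are illustrative.

```python
from fractions import Fraction
from math import comb

# Pen counts per box, taken from the table in the question.
boxes = {
    "A": {"green": 3, "blue": 2, "yellow": 1},
    "B": {"green": 2, "blue": 1, "yellow": 2},
    "C": {"green": 4, "blue": 2, "yellow": 3},
}


def likelihood(counts):
    """P(one blue and one green | box) when drawing two pens without replacement."""
    total = sum(counts.values())
    return Fraction(counts["green"] * counts["blue"], comb(total, 2))


prior = Fraction(1, len(boxes))  # each box is equally likely to be chosen
joint = {name: prior * likelihood(counts) for name, counts in boxes.items()}
evidence = sum(joint.values())
posterior = {name: value / evidence for name, value in joint.items()}

print(posterior["A"])  # 18/37
```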

**Question 7:**

*True or False: The primary advantage of the tournament approach in multiclass classification is its effectiveness when using weak classifiers.*

**Answer:** True

**Reason:** Tournament-based approaches are especially useful in multiclass classification scenarios with weak classifiers because they reduce the problem into smaller binary classification tasks, which can be more efficiently solved.

**Question 8:**

*A data scientist is using a Naive Bayes classifier to categorize emails as either "spam" or "not spam." The features used for classification include:*

- Number of recipients (To, Cc, Bcc)
- Presence of "spam keywords" (e.g., URGENT, offer, free)
- Time of day the email was sent
- Length of the email in words

*Which of the following scenarios, if true, is most likely to violate the key assumptions of Naive Bayes and potentially impact its performance?*

- (a) The length of the email follows a non-Gaussian distribution
- (b) The time of day is discretized into categories (morning, afternoon, evening, night)
- (c) The proportion of spam emails in the training data is lower than in real-world email traffic
- (d) There's a strong correlation between the presence of the word "free" and the length of the email

**Answer:** (d) There's a strong correlation between the presence of the word "free" and the length of the email

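**Reason:** Naive Bayes assumes that features are conditionally independent given the class. A strong correlation between two features (here, the presence of the word "free" and the length of the email) violates this assumption: the correlated evidence is effectively counted twice, which can distort the predicted probabilities and hurt classification performance. The other options concern the choice of per-feature distribution, the feature encoding, or the class prior, none of which breaks the independence assumption itself.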
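
For reference, the independence assumption is what allows the posterior to factorize over the four features. Writing the class as $y$ and the features as $x_1, \dots, x_4$ (symbols introduced here for illustration), the Naive Bayes model is:

$$
P(y \mid x_1, \dots, x_4) \;\propto\; P(y) \prod_{i=1}^{4} P(x_i \mid y)
$$

If two of the $x_i$ are strongly correlated within a class, the product treats them as separate pieces of evidence even though they carry overlapping information.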

**Question 9:**

*Consider these two statements:*

**Statement 1:** Every Bayesian Network is inherently structured as a Directed Acyclic Graph (DAG).

**Statement 2:** Each node in a Bayesian network represents a random variable, and each edge represents conditional dependence.

*Which of these are true?*

- (a) Both the statements are True
- (b) Statement 1 is true, and statement 2 is false
- (c) Statement 1 is false, and statement 2 is true
- (d) Both the statements are false

**Answer:** (a) Both the statements are True

**Reason:** A Bayesian Network is a Directed Acyclic Graph (DAG), where each node represents a random variable, and the directed edges represent conditional dependencies between variables. This is a fundamental property of Bayesian Networks.
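
The acyclicity requirement in Statement 1 can be checked mechanically. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the helper `is_dag` and both example graphs are hypothetical and are not the network from the assignment.

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(parents):
    """Return True if the graph has no directed cycle.

    `parents` maps each node to the set of its parent nodes.
    """
    try:
        # static_order() raises CycleError if a cycle exists.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


# Hypothetical chain a -> b -> c: a valid Bayesian network structure.
chain = {"a": set(), "b": {"a"}, "c": {"b"}}

# Hypothetical loop a -> b -> a: not a valid Bayesian network structure.
loop = {"a": {"b"}, "b": {"a"}}

print(is_dag(chain), is_dag(loop))
```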