The NPTEL Introduction to Machine Learning course (July–October 2024 session) covers three key topics in Week 3: linear classification, logistic regression, and Linear Discriminant Analysis (LDA). This week's assignment tests understanding of these concepts through the questions worked out below.

### Question 1

For a two-class problem using discriminant functions (where $d_i$ is the discriminant function for class $i$), where is the separating hyperplane located?

Given:

- $d_1(\mathbf{x}) = \mathbf{x}^T \mathbf{w}_1 + w_{10}$
- $d_2(\mathbf{x}) = \mathbf{x}^T \mathbf{w}_2 + w_{20}$
- The separating hyperplane is where $d_1(\mathbf{x}) = d_2(\mathbf{x})$.

Since $d_1(\mathbf{x}) = d_2(\mathbf{x})$, we get:

$\mathbf{x}^T \mathbf{w}_1 + w_{10} = \mathbf{x}^T \mathbf{w}_2 + w_{20}$

$\mathbf{x}^T (\mathbf{w}_1 - \mathbf{w}_2) = w_{20} - w_{10}$

Therefore, the separating hyperplane is defined by: $\mathbf{x}^T (\mathbf{w}_1 - \mathbf{w}_2) = w_{20} - w_{10}$

**Answer:** $\mathbf{x}^T (\mathbf{w}_1 - \mathbf{w}_2) = w_{20} - w_{10}$
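As a quick numerical sanity check, the condition can be verified in NumPy. The weight vectors and offsets below are made-up illustrative values, not part of the question:

```python
import numpy as np

# Illustrative (made-up) parameters for the two discriminant functions
w1, w10 = np.array([2.0, 1.0]), -1.0
w2, w20 = np.array([1.0, 3.0]), 2.0

def d1(x):
    return x @ w1 + w10

def d2(x):
    return x @ w2 + w20

# A point x lies on the separating hyperplane iff d1(x) == d2(x),
# i.e. x @ (w1 - w2) == w20 - w10.
# Here w1 - w2 = (1, -2) and w20 - w10 = 3, so x = (1, -1) qualifies.
x = np.array([1.0, -1.0])
print(np.isclose(d1(x), d2(x)))  # True: the point is on the hyperplane
```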

### Question 2

Given the following dataset consisting of two classes, $A$ and $B$, calculate the prior probability of each class.

| Feature 1 | Class |
|---|---|
| 2.3 | A |
| 1.8 | A |
| 3.2 | A |
| 1.2 | A |
| 2.1 | A |
| 1.9 | B |
| 2.4 | B |
Calculate $P(A)$ and $P(B)$:

- Number of samples for class A, $n_A = 5$
- Number of samples for class B, $n_B = 2$
- Total number of samples, $n = 7$

Prior probabilities:

$P(A) = \frac{n_A}{n} = \frac{5}{7} \approx 0.714$

$P(B) = \frac{n_B}{n} = \frac{2}{7} \approx 0.286$

**Answer:**
$P(A) = 0.714, P(B) = 0.286$
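The priors are simply empirical class frequencies, which takes one line per class to compute. A minimal sketch using the labels from the table above:

```python
# Class labels taken from the dataset in the table above
labels = ["A", "A", "A", "A", "A", "B", "B"]
n = len(labels)

# Prior probability = class count / total sample count
p_a = labels.count("A") / n
p_b = labels.count("B") / n
print(round(p_a, 3), round(p_b, 3))  # 0.714 0.286
```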

### Question 3

In a 3-class classification problem using linear regression, the output vectors for three data points are $(0.8, 0.3, -0.1)$, $(0.2, 0.6, 0.2)$, and $(0.1, 0.4, 0.4)$. To which classes would these points be assigned?

Assignment is based on the highest output value for each data point:

- Data point $(0.8, 0.3, -0.1)$ -> Class 1 (0.8 is the highest)
- Data point $(0.2, 0.6, 0.2)$ -> Class 2 (0.6 is the highest)
- Data point $(0.1, 0.4, 0.4)$ -> Class 2 (0.4 is the highest, shared by classes 2 and 3; the tie is broken in favour of the lower class index)

**Answer:**

- $(0.8, 0.3, -0.1)$ -> Class 1
- $(0.2, 0.6, 0.2)$ -> Class 2
- $(0.1, 0.4, 0.4)$ -> Class 2
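This assignment rule is an argmax over each output vector. A sketch in NumPy, whose `argmax` returns the first (lowest-index) maximum and so matches the tie-breaking used above:

```python
import numpy as np

# Output vectors for the three data points
outputs = np.array([
    [0.8, 0.3, -0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.4, 0.4],
])

# argmax picks the first maximum on ties; +1 numbers classes from 1
classes = outputs.argmax(axis=1) + 1
print(classes.tolist())  # [1, 2, 2]
```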

### Question 4

If you have a 5-class classification problem and want to avoid masking using polynomial regression, what is the minimum degree of the polynomial you should use?

For a $k$-class problem, to avoid masking, we need to use a polynomial of degree $k-1$.

For 5 classes, $k = 5$, so the minimum degree of the polynomial is $k - 1 = 5 - 1 = 4$.

**Answer:** 4

### Question 5

Consider a logistic regression model where the predicted probability for a given data point is 0.4. If the actual label for this data point is 1, what is the contribution of this data point to the log-likelihood?

The log-likelihood contribution of a single data point in logistic regression is:

$\text{LL} = y \log(p) + (1 - y) \log(1 - p)$

where $y$ is the actual label and $p$ is the predicted probability.

Given: $y = 1, p = 0.4$

Contribution to the log-likelihood:

$\text{LL} = 1 \cdot \log(0.4) + (1 - 1) \cdot \log(1 - 0.4) = \log(0.4) \approx -0.9163$

**Answer:** $-0.9163$
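The formula above translates directly into a small helper function (a sketch; the function name is ours, not from the course material):

```python
import math

def log_likelihood_contribution(y, p):
    """Per-sample log-likelihood for logistic regression:
    y*log(p) + (1-y)*log(1-p), with y in {0, 1} and 0 < p < 1."""
    return y * math.log(p) + (1 - y) * math.log(1 - p)

# y = 1, p = 0.4: only the y*log(p) term survives
print(round(log_likelihood_contribution(1, 0.4), 4))  # -0.9163
```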

### Question 6

What additional assumption does LDA make about the covariance matrix in comparison to the basic assumption of Gaussian class conditional density?

Linear Discriminant Analysis (LDA) assumes that the covariance matrix is the same for all classes.

**Answer:** The covariance matrix is the same for all classes.

### Question 7

What is the shape of the decision boundary in LDA?

In LDA, the decision boundary is linear.

**Answer:** Linear

### Question 8

For two classes $C_1$ and $C_2$ with within-class variances $\sigma_{1}^2 = 1$ and $\sigma_{2}^2 = 4$ respectively, if the projected means are $\mu_{1} = 1$ and $\mu_{2} = 3$, what is the Fisher criterion $J(w)$?

The Fisher criterion is given by: $J(w) = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2}$

Given: $\mu_1 = 1$, $\mu_2 = 3$, $\sigma_1^2 = 1$, $\sigma_2^2 = 4$

Calculate $J(w)$:

$J(w) = \frac{(1 - 3)^2}{1 + 4} = \frac{4}{5} = 0.8$

**Answer:** 0.8
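The Fisher criterion for two projected classes is one expression; a minimal sketch (the function name is ours):

```python
def fisher_criterion(mu1, mu2, var1, var2):
    """Fisher criterion J(w) = (mu1 - mu2)^2 / (var1 + var2)
    for two classes projected onto a line."""
    return (mu1 - mu2) ** 2 / (var1 + var2)

print(fisher_criterion(1, 3, 1, 4))  # 0.8
```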

### Question 9

Given two classes $C_1$ and $C_2$ with means $\mu_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ and $\mu_2 = \begin{bmatrix} 5 \\ 7 \end{bmatrix}$ respectively, what is the direction vector for LDA when the within-class covariance matrix $S_W$ is the identity matrix $I$?

For LDA, the direction vector $w$ is given by: $w = S_W^{-1} (\mu_1 - \mu_2)$

Given: $\mu_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, $\mu_2 = \begin{bmatrix} 5 \\ 7 \end{bmatrix}$, $S_W = I$

Calculate $\mu_1 - \mu_2$:

$\mu_1 - \mu_2 = \begin{bmatrix} 2 \\ 3 \end{bmatrix} - \begin{bmatrix} 5 \\ 7 \end{bmatrix} = \begin{bmatrix} 2 - 5 \\ 3 - 7 \end{bmatrix} = \begin{bmatrix} -3 \\ -4 \end{bmatrix}$

Since $S_W = I$, the direction vector is $w = I^{-1} \begin{bmatrix} -3 \\ -4 \end{bmatrix} = \begin{bmatrix} -3 \\ -4 \end{bmatrix}$.

**Answer:** $\begin{bmatrix} -3 \\ -4 \end{bmatrix}$
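The same computation in NumPy, written generally with `np.linalg.solve` so it also works when $S_W$ is not the identity (a sketch following the question's convention $w = S_W^{-1}(\mu_1 - \mu_2)$):

```python
import numpy as np

mu1 = np.array([2.0, 3.0])
mu2 = np.array([5.0, 7.0])
S_W = np.eye(2)  # within-class covariance (identity in this question)

# w = S_W^{-1} (mu1 - mu2); solve() avoids forming the inverse explicitly.
# With S_W = I this reduces to the mean difference itself.
w = np.linalg.solve(S_W, mu1 - mu2)
print(w.tolist())  # [-3.0, -4.0]
```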