ABW505 Mock Exam 2: Python & Machine Learning (Verified Answers)
📋 Exam Information
| Item | Details |
|---|---|
| Total Points | 100 |
| Time Allowed | 90 minutes |
| Format | Closed book, calculator allowed |
| Structure | Q1 (20pts) + Q2 (30pts, choose 3/5) + Q3 (25pts) + Q4 (25pts) |
Question 1: Python Output Analysis (20 points)
Answer ALL parts. Determine the exact output of each snippet.
Q1.1 (5 points)
a = [1, 2, 3]
b = a
b.append(4)
print(a)
print(a is b)
💡 Answer & Explanation
Step-by-step breakdown:
# Step 1: Create a list and assign to variable 'a'
a = [1, 2, 3]
# Memory: a points to list object [1, 2, 3]
# Step 2: Assign 'a' to 'b'
b = a
# IMPORTANT: This does NOT copy the list!
# Both 'a' and 'b' now point to the SAME list object in memory
# Memory: a → [1, 2, 3] ← b
# Step 3: Modify list through 'b'
b.append(4)
# Since a and b point to the same object,
# the change is visible through both variables
# Memory: a → [1, 2, 3, 4] ← b
# Step 4: Print results
print(a) # [1, 2, 3, 4] - modified through b
print(a is b) # True - same object in memory
Answers:
- print(a) → [1, 2, 3, 4]
- print(a is b) → True
Key concept: In Python, assignment creates a REFERENCE, not a copy.
To create an independent copy:
b = a.copy() # Method 1: copy() method
b = a[:] # Method 2: slice notation
b = list(a) # Method 3: list constructor
Q1.2 (5 points)
text = "Hello World"
print(text[0:5:2])
print(text[-5:-1])
💡 Answer & Explanation
Step-by-step breakdown:
text = "Hello World"
# Index map:
# Character:   H    e    l    l    o  ' '    W    o    r    l    d
# Positive:    0    1    2    3    4    5    6    7    8    9   10
# Negative:  -11  -10   -9   -8   -7   -6   -5   -4   -3   -2   -1
# Line 1: text[0:5:2]
# Format: [start:stop:step]
# start=0 (H), stop=5 (exclusive), step=2 (every 2nd char)
# Indices: 0, 2, 4 → Characters: 'H', 'l', 'o'
result1 = text[0:5:2] # "Hlo"
# Line 2: text[-5:-1]
# start=-5 (W), stop=-1 (exclusive, before 'd')
# Indices: -5, -4, -3, -2 → Characters: 'W', 'o', 'r', 'l'
result2 = text[-5:-1] # "Worl"
Answers:
- text[0:5:2] → Hlo
- text[-5:-1] → Worl
Slicing syntax: [start:stop:step]
- start: inclusive (default: 0)
- stop: exclusive (default: end)
- step: increment (default: 1)
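As a quick check of the slicing rules above, the sketch below uses Python's built-in slice object, whose indices() method resolves negative and omitted bounds for a given length. The variable names first and second are illustrative only.

```python
text = "Hello World"  # 11 characters; the space sits at index 5

# slice.indices(length) returns the concrete (start, stop, step) values
first = slice(0, 5, 2)
second = slice(-5, -1)

print(first.indices(len(text)))   # (0, 5, 2)
print(second.indices(len(text)))  # (6, 10, 1)

# Applying the slice objects gives the same results as the literal syntax
print(text[first])   # Hlo
print(text[second])  # Worl

# Omitted bounds fall back to the defaults: start=0, stop=end, step=1
print(text[:5])   # Hello
print(text[6:])   # World
print(text[::2])  # HloWrd
```

Resolving -5 to 6 and -1 to 10 makes it clear why the second slice stops just before the final 'd'.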
Q1.3 (5 points)
def outer():
    x = 10
    def inner():
        nonlocal x
        x += 5
        return x
    return inner()
print(outer())
print(outer())
💡 Answer & Explanation
Step-by-step breakdown:
def outer():
    x = 10 # Local variable in outer's scope
    def inner():
        nonlocal x # Refers to x in enclosing (outer) scope
        x += 5 # Modify outer's x: 10 + 5 = 15
        return x # Return 15
    return inner() # Call inner() and return its result
# First call: outer()
# - x starts at 10 in a NEW local scope
# - inner() adds 5: x = 15
# - Returns 15
print(outer()) # 15
# Second call: outer()
# - FRESH call creates NEW local scope
# - x starts at 10 again (not preserved from first call)
# - inner() adds 5: x = 15
# - Returns 15
print(outer()) # 15
Answers:
- First print(outer()) → 15
- Second print(outer()) → 15
Key concepts:
- nonlocal modifies a variable in the enclosing scope (not global)
- Each call to outer() creates a fresh local scope
- Variable x is NOT preserved between calls
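For contrast, here is a minimal sketch of a case where the value does persist: when the outer function returns inner itself instead of calling it, the closure keeps the enclosing scope alive. The name make_counter and its parameters are hypothetical and not part of the exam question.

```python
def make_counter(start=10, step=5):
    """Hypothetical helper: returns a function that remembers its own x."""
    x = start

    def inner():
        nonlocal x  # rebind x in make_counter's scope
        x += step
        return x

    return inner  # return the function itself, not inner()


counter = make_counter()
print(counter())  # 15
print(counter())  # 20 - x survived because the closure keeps the scope alive

other = make_counter()
print(other())    # 15 - a new call to make_counter creates a separate x
```

The difference from the exam code is only the missing pair of parentheses on the return line, which is an easy detail to misread under time pressure.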
Q1.4 (5 points)
nums = [1, 2, 3, 4, 5]
result = [x**2 for x in nums if x % 2 == 1]
print(result)
print(sum(result))
💡 Answer & Explanation
Step-by-step breakdown:
nums = [1, 2, 3, 4, 5]
# List comprehension with filter
# Pattern: [expression for item in iterable if condition]
result = [x**2 for x in nums if x % 2 == 1]
# Step-by-step execution:
# x=1: 1 % 2 == 1? True → 1**2 = 1 → include
# x=2: 2 % 2 == 1? False → skip
# x=3: 3 % 2 == 1? True → 3**2 = 9 → include
# x=4: 4 % 2 == 1? False → skip
# x=5: 5 % 2 == 1? True → 5**2 = 25 → include
# Result: [1, 9, 25]
print(result) # [1, 9, 25]
print(sum(result)) # 1 + 9 + 25 = 35
Answers:
- print(result) → [1, 9, 25]
- print(sum(result)) → 35
Breakdown:
- Filter: odd numbers only (1, 3, 5)
- Transform: square each (1, 9, 25)
- Sum: 1 + 9 + 25 = 35
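The filter-then-transform order can also be written as an explicit loop. The sketch below is an equivalent rewrite for illustration, and the variable total is an added name.

```python
nums = [1, 2, 3, 4, 5]

# Explicit loop equivalent of [x**2 for x in nums if x % 2 == 1]
result = []
for x in nums:
    if x % 2 == 1:           # filter: keep odd numbers
        result.append(x**2)  # transform: square

print(result)  # [1, 9, 25]

# A generator expression computes the sum without building the list
total = sum(x**2 for x in nums if x % 2 == 1)
print(total)   # 35
```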
Question 2: Code Writing (30 points)
Choose 3 out of 5 questions. Each worth 10 points.
Q2.1 - Count Vowels (10 points)
Write a function count_vowels(text) that:
- Counts vowels (a, e, i, o, u) - case insensitive
- Returns the count as an integer
💡 Verified Answer
def count_vowels(text):
    """
    Count the number of vowels in a string.
    Vowels are: a, e, i, o, u (case insensitive)
    Args:
        text: Input string to analyze
    Returns:
        int: Number of vowels found
    Examples:
        >>> count_vowels("Hello World")
        3
        >>> count_vowels("AEIOU")
        5
    """
    # Define vowels (both cases for easy comparison)
    vowels = "aeiouAEIOU"
    # Initialize counter
    count = 0
    # Iterate through each character in the text
    for char in text:
        # Check if character is a vowel
        if char in vowels:
            count += 1
    return count
# Alternative: More Pythonic one-liner
def count_vowels_v2(text):
    """One-liner using generator expression and sum."""
    return sum(1 for char in text.lower() if char in 'aeiou')
# Alternative: Using count method
def count_vowels_v3(text):
    """Using str.count() for each vowel."""
    text_lower = text.lower()
    return sum(text_lower.count(v) for v in 'aeiou')
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        ("Hello World", 3), # e, o, o
        ("AEIOU", 5), # all uppercase vowels
        ("rhythm", 0), # no vowels
        ("", 0), # empty string
        ("AaEeIiOoUu", 10), # mixed case
    ]
    print("Testing count_vowels:")
    for text, expected in test_cases:
        result = count_vowels(text)
        status = "✓" if result == expected else "✗"
        print(f' {status} count_vowels("{text}") = {result} (expected {expected})')
Test Output:
Testing count_vowels:
✓ count_vowels("Hello World") = 3 (expected 3)
✓ count_vowels("AEIOU") = 5 (expected 5)
✓ count_vowels("rhythm") = 0 (expected 0)
✓ count_vowels("") = 0 (expected 0)
✓ count_vowels("AaEeIiOoUu") = 10 (expected 10)
Key points:
- Handle both uppercase and lowercase
- Use the in operator for membership test
- Simple counter pattern
Q2.2 - Word Frequency Dictionary (10 points)
Write a function word_frequency(words) that:
- Input: list of words
- Return: dictionary with word counts
- Example: ['a', 'b', 'a'] → {'a': 2, 'b': 1}
💡 Verified Answer
def word_frequency(words):
    """
    Count frequency of each word in a list.
    Args:
        words: List of words (strings)
    Returns:
        dict: Dictionary mapping each word to its count
    Examples:
        >>> word_frequency(['a', 'b', 'a'])
        {'a': 2, 'b': 1}
    """
    # Initialize empty frequency dictionary
    freq = {}
    # Count each word
    for word in words:
        if word in freq:
            # Word seen before - increment count
            freq[word] += 1
        else:
            # First occurrence - initialize count to 1
            freq[word] = 1
    return freq
# Alternative: Using dict.get()
def word_frequency_v2(words):
    """Using get() method to simplify logic."""
    freq = {}
    for word in words:
        # get(key, default) returns default if key doesn't exist
        freq[word] = freq.get(word, 0) + 1
    return freq
# Alternative: Using collections.Counter
def word_frequency_v3(words):
    """Using Counter from collections (most Pythonic)."""
    from collections import Counter
    return dict(Counter(words))
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        ['a', 'b', 'a'],
        ['hello', 'world', 'hello', 'hello'],
        [],
        ['single'],
    ]
    print("Testing word_frequency:")
    for words in test_cases:
        result = word_frequency(words)
        print(f" {words} → {result}")
Test Output:
Testing word_frequency:
['a', 'b', 'a'] → {'a': 2, 'b': 1}
['hello', 'world', 'hello', 'hello'] → {'hello': 3, 'world': 1}
[] → {}
['single'] → {'single': 1}
Key techniques:
- Check if key exists before incrementing
- Alternative: use dict.get(key, default)
- Best practice: use collections.Counter
Q2.3 - Multiplication Table (10 points)
Write a function multiplication_table(n) that:
- Prints an n×n multiplication table
- Format: 1 x 1 = 1
💡 Verified Answer
def multiplication_table(n):
    """
    Print an n×n multiplication table.
    Args:
        n: Size of the table (positive integer)
    Example output for n=3:
        1 x 1 = 1
        1 x 2 = 2
        1 x 3 = 3
        2 x 1 = 2
        ...
        3 x 3 = 9
    """
    # Validate input
    if n <= 0:
        print("Please provide a positive integer.")
        return
    # Outer loop: rows (first multiplier)
    for i in range(1, n + 1):
        # Inner loop: columns (second multiplier)
        for j in range(1, n + 1):
            # Calculate product
            product = i * j
            # Print formatted result using f-string
            print(f"{i} x {j} = {product}")
# Alternative: Compact table format
def multiplication_table_compact(n):
    """Print table in grid format."""
    for i in range(1, n + 1):
        row = ""
        for j in range(1, n + 1):
            row += f"{i*j:4}" # 4-character width for alignment
        print(row)
# ===== Test Cases =====
if __name__ == "__main__":
    print("3x3 Multiplication Table:")
    print("-" * 20)
    multiplication_table(3)
    print("\n3x3 Compact Format:")
    print("-" * 20)
    multiplication_table_compact(3)
Test Output:
3x3 Multiplication Table:
--------------------
1 x 1 = 1
1 x 2 = 2
1 x 3 = 3
2 x 1 = 2
2 x 2 = 4
2 x 3 = 6
3 x 1 = 3
3 x 2 = 6
3 x 3 = 9
3x3 Compact Format:
--------------------
   1   2   3
   2   4   6
   3   6   9
Key concepts:
- Nested loops for 2D iteration
- range(1, n+1) to start from 1
- f-strings for formatted output
Q2.4 - Find Max and Min (10 points)
Write a function find_max_min(numbers) that:
- Input: list of numbers
- Return: tuple (maximum, minimum, difference)
- Handle empty list by returning (None, None, None)
💡 Verified Answer
def find_max_min(numbers):
    """
    Find maximum, minimum, and their difference in a list.
    Args:
        numbers: List of numeric values
    Returns:
        tuple: (maximum, minimum, difference) or (None, None, None) if empty
    Examples:
        >>> find_max_min([5, 2, 8, 1, 9])
        (9, 1, 8)
        >>> find_max_min([])
        (None, None, None)
    """
    # Handle empty list edge case
    # IMPORTANT: Check this first to avoid errors with min()/max()
    if not numbers: # Empty list is falsy in Python
        return (None, None, None)
    # Find maximum and minimum using built-in functions
    maximum = max(numbers)
    minimum = min(numbers)
    # Calculate difference (range of values)
    difference = maximum - minimum
    return (maximum, minimum, difference)
# Alternative: Without using built-in min/max
def find_max_min_manual(numbers):
    """Manual implementation without min()/max()."""
    if not numbers:
        return (None, None, None)
    # Initialize with first element
    maximum = numbers[0]
    minimum = numbers[0]
    # Iterate through remaining elements
    for num in numbers[1:]:
        if num > maximum:
            maximum = num
        if num < minimum:
            minimum = num
    return (maximum, minimum, maximum - minimum)
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        [5, 2, 8, 1, 9], # Normal case
        [3], # Single element
        [], # Empty list
        [-5, -2, -8, -1], # Negative numbers
        [1, 1, 1, 1], # All same
    ]
    print("Testing find_max_min:")
    for nums in test_cases:
        result = find_max_min(nums)
        print(f" {nums} → max={result[0]}, min={result[1]}, diff={result[2]}")
Test Output:
Testing find_max_min:
[5, 2, 8, 1, 9] → max=9, min=1, diff=8
[3] → max=3, min=3, diff=0
[] → max=None, min=None, diff=None
[-5, -2, -8, -1] → max=-1, min=-8, diff=7
[1, 1, 1, 1] → max=1, min=1, diff=0
Key points:
- ALWAYS handle empty list first
- Use built-in min() and max() for efficiency
- Return a tuple, not a list
Q2.5 - Factorial (Recursive) (10 points)
Write a function factorial(n) that:
- Calculates n! recursively
- Handle: 0! = 1, negative returns None
💡 Verified Answer
def factorial(n):
    """
    Calculate factorial of n using recursion.
    Factorial definition:
    - n! = n × (n-1) × (n-2) × ... × 2 × 1
    - 0! = 1 (by definition)
    - Negative numbers: undefined (return None)
    Args:
        n: Non-negative integer
    Returns:
        int: n! or None for negative input
    Examples:
        >>> factorial(5)
        120
        >>> factorial(0)
        1
    """
    # Handle negative input
    if n < 0:
        return None
    # Base case: 0! = 1 and 1! = 1
    if n == 0 or n == 1:
        return 1
    # Recursive case: n! = n × (n-1)!
    return n * factorial(n - 1)
# Trace for factorial(4):
# factorial(4) = 4 × factorial(3)
# = 4 × (3 × factorial(2))
# = 4 × (3 × (2 × factorial(1)))
# = 4 × (3 × (2 × 1))
# = 4 × (3 × 2)
# = 4 × 6
# = 24
# Alternative: Iterative version (no recursion)
def factorial_iterative(n):
    """Calculate factorial using iteration."""
    if n < 0:
        return None
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (0, 1), # 0! = 1
        (1, 1), # 1! = 1
        (5, 120), # 5! = 120
        (10, 3628800),
        (-5, None), # Negative
    ]
    print("Testing factorial:")
    for n, expected in test_cases:
        result = factorial(n)
        status = "✓" if result == expected else "✗"
        print(f" {status} factorial({n}) = {result} (expected {expected})")
Test Output:
Testing factorial:
✓ factorial(0) = 1 (expected 1)
✓ factorial(1) = 1 (expected 1)
✓ factorial(5) = 120 (expected 120)
✓ factorial(10) = 3628800 (expected 3628800)
✓ factorial(-5) = None (expected None)
Recursion components:
- Base case: stops recursion (n=0 or n=1)
- Recursive case: breaks problem into smaller subproblem
- Progress: n decreases each call, eventually reaching base case
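The "progress" point has a practical side: each recursive call occupies a stack frame, so the recursive version fails for very large n. The sketch below repeats the recursive definition so it runs on its own, cross-checks it against math.factorial, and then deliberately exceeds the interpreter's recursion limit. The flag hit_limit is an illustrative name, and the exact limit depends on the Python installation.

```python
import math
import sys

def factorial(n):
    """Same recursive definition as in the answer above."""
    if n < 0:
        return None
    if n == 0 or n == 1:
        return 1
    return n * factorial(n - 1)

# Cross-check against the standard library for small inputs
for n in range(0, 11):
    assert factorial(n) == math.factorial(n)

# Going deeper than the recursion limit raises RecursionError
hit_limit = False
try:
    factorial(sys.getrecursionlimit() + 100)
except RecursionError:
    hit_limit = True

print("recursion limit:", sys.getrecursionlimit())
print("limit reached:", hit_limit)
```

The iterative version shown earlier does not have this restriction, since it uses a single loop instead of nested calls.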
Question 3: Pandas & SVM/Random Forest (25 points)
Part A: Data Preprocessing (15 points)
Given this DataFrame:
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
data = {
    'Age': [25, 30, None, 35, 40],
    'Income': [30000, 50000, 45000, None, 60000],
    'Education': ['High School', 'Bachelor', 'Master', 'PhD', 'Bachelor'],
    'Purchased': ['No', 'Yes', 'Yes', 'No', 'Yes']
}
df = pd.DataFrame(data)
Q3.A1 (5 points) Fill missing Age with median, missing Income with mean.
💡 Verified Answer
import pandas as pd
# Create the DataFrame
data = {
    'Age': [25, 30, None, 35, 40],
    'Income': [30000, 50000, 45000, None, 60000],
    'Education': ['High School', 'Bachelor', 'Master', 'PhD', 'Bachelor'],
    'Purchased': ['No', 'Yes', 'Yes', 'No', 'Yes']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
print()
# Calculate statistics before filling
age_median = df['Age'].median() # Median of [25, 30, 35, 40] = 32.5
income_mean = df['Income'].mean() # Mean of [30000, 50000, 45000, 60000] = 46250
print(f"Age median (excluding NaN): {age_median}")
print(f"Income mean (excluding NaN): {income_mean}")
print()
# Fill missing values
# Method 1: assign the result of fillna back to the column
df['Age'] = df['Age'].fillna(age_median)
df['Income'] = df['Income'].fillna(income_mean)
# Method 2: pass a column-to-value mapping to DataFrame.fillna (alternative)
# df = df.fillna({'Age': age_median, 'Income': income_mean})
# Note: df['Age'].fillna(age_median, inplace=True) is best avoided; recent pandas
# versions warn about inplace calls on a selected column, and they may not update df
print("After filling missing values:")
print(df)
Calculations:
- Age values (excluding NaN): [25, 30, 35, 40]
- Age median: (30 + 35) / 2 = 32.5
- Income values (excluding NaN): [30000, 50000, 45000, 60000]
- Income mean: (30000 + 50000 + 45000 + 60000) / 4 = 46250
Result:
- Row index 2: Age filled with 32.5
- Row index 3: Income filled with 46250.0
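The two fill values can be double-checked without pandas. The sketch below uses the standard-library statistics module, and fill_missing is a hypothetical helper written only for this check (pandas skips NaN in the same way when computing median and mean).

```python
from statistics import mean, median

age = [25, 30, None, 35, 40]
income = [30000, 50000, 45000, None, 60000]

def fill_missing(values, statistic):
    """Hypothetical helper: replace None with a statistic of the known values."""
    known = [v for v in values if v is not None]
    fill_value = statistic(known)
    return [fill_value if v is None else v for v in values]

age_filled = fill_missing(age, median)
income_filled = fill_missing(income, mean)

print(age_filled[2])     # 32.5, the median of 25, 30, 35, 40
print(income_filled[3])  # 46250, the mean of the four known incomes
```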
Q3.A2 (5 points) Encode 'Education' using LabelEncoder. Show the mapping.
💡 Verified Answer
from sklearn.preprocessing import LabelEncoder
# Create LabelEncoder instance
le = LabelEncoder()
# Fit and transform the Education column
df['Education_Encoded'] = le.fit_transform(df['Education'])
print("Encoding result:")
print(df[['Education', 'Education_Encoded']])
print()
# Show the mapping (classes are sorted alphabetically)
print("LabelEncoder mapping:")
for i, label in enumerate(le.classes_):
    print(f" '{label}' → {i}")
LabelEncoder sorts the classes alphabetically, then assigns 0, 1, 2, ...:
| Original | Encoded |
|---|---|
| Bachelor | 0 |
| High School | 1 |
| Master | 2 |
| PhD | 3 |
Encoded column: [1, 0, 2, 3, 0]
Important: LabelEncoder assigns integers based on alphabetical order, not order of appearance!
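The mapping can be reproduced by hand in plain Python. This is only a sketch that mimics the sorted-classes behaviour described above, not scikit-learn's implementation, and the ordinal dictionary at the end is an added assumption to show what an education-level ordering would look like.

```python
education = ['High School', 'Bachelor', 'Master', 'PhD', 'Bachelor']

# Sorted unique labels play the role of le.classes_
classes = sorted(set(education))
mapping = {label: code for code, label in enumerate(classes)}
encoded = [mapping[label] for label in education]

print(mapping)  # {'Bachelor': 0, 'High School': 1, 'Master': 2, 'PhD': 3}
print(encoded)  # [1, 0, 2, 3, 0]

# The codes carry no ranking: Bachelor (0) comes before High School (1)
# only because of alphabetical order. A hand-written mapping keeps the real order:
ordinal = {'High School': 0, 'Bachelor': 1, 'Master': 2, 'PhD': 3}
print([ordinal[label] for label in education])  # [0, 1, 2, 3, 1]
```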
Q3.A3 (5 points) When should you use StandardScaler vs MinMaxScaler?
💡 Answer
| Scaler | Formula | Output | Best For |
|---|---|---|---|
| StandardScaler | (x - mean) / std | Mean=0, Std=1 | SVM, Logistic Regression, roughly Gaussian features |
| MinMaxScaler | (x - min) / (max - min) | [0, 1] | Neural Networks, KNN, image data |
Use StandardScaler when:
- Data is approximately normally distributed
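- Outliers are present: they still shift the mean and std, but they do not squeeze the remaining values into a narrow band the way MinMaxScaler does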
- Using algorithms like SVM, Linear Regression
Use MinMaxScaler when:
- You need bounded output (0 to 1)
- Working with neural networks or image data
- Outliers are not a concern
Quick rule:
- SVM, Linear models → StandardScaler
- Neural networks, KNN → MinMaxScaler
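To make the two formulas concrete, the sketch below applies them in plain Python to the filled Income values from Q3.A1 and then adds one artificial outlier. The helper names and the outlier value of 1,000,000 are assumptions for illustration; pstdev is used because StandardScaler divides by the population standard deviation.

```python
from statistics import mean, pstdev

def standard_scale(values):
    """z = (x - mean) / std, using the population std."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma for x in values]

def min_max_scale(values):
    """(x - min) / (max - min), which maps the data onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

incomes = [30000, 50000, 45000, 46250, 60000]  # Income column after filling
z = standard_scale(incomes)
m = min_max_scale(incomes)
print([round(v, 2) for v in z])  # centred on 0 with unit spread
print([round(v, 2) for v in m])  # smallest value 0.0, largest 1.0

# One extreme value pushes every other min-max result close to 0
with_outlier = min_max_scale(incomes + [1_000_000])
print([round(v, 3) for v in with_outlier])
```

With the outlier included, the five original incomes all land below 0.04 after min-max scaling, which is why bounded scaling is usually paired with data that has no extreme values.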
Part B: SVM & Random Forest Theory (10 points)
Q3.B1 (5 points) Explain the "kernel trick" in SVM.
💡 Answer
Kernel Trick Explanation:
Problem: Some data is not linearly separable in its original space.
Solution: Map the data into a higher-dimensional space where it becomes linearly separable, and use a kernel function so that the mapping never has to be computed explicitly.
How it works:
- Original 2D data might have circular boundaries (can't draw a straight line)
- Transform to 3D using a feature map (for example, adding x² + y² as a third coordinate)
- In 3D, a flat plane can now separate the classes
- The "trick": SVM training and prediction only need dot products between points, and the kernel K(x, z) returns the dot product in the higher-dimensional space directly from the original vectors
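A small numeric check makes the last point concrete. The sketch below uses the simplest degree-2 polynomial kernel, K(x, z) = (x · z)², whose explicit feature map for 2D inputs is φ(x) = (x₁², √2·x₁x₂, x₂²). The function names are illustrative, and scikit-learn's 'poly' kernel has additional gamma and coef0 terms that are left out here.

```python
import math

def phi(v):
    """Explicit feature map for the degree-2 polynomial kernel in 2D."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def poly2_kernel(x, z):
    """K(x, z) = (x . z)^2, computed in the original 2D space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))  # map to 3D first, then take the dot product
implicit = poly2_kernel(x, z)   # never leaves 2D
print(explicit, implicit)       # both equal 121 up to floating-point rounding
```

Both routes give the same number, but the kernel version needs only one 2D dot product. For kernels such as RBF the implied feature space is infinite-dimensional, so the explicit route is not available at all.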
Common kernels:
| Kernel | Use Case |
|---|---|
| Linear | Already linearly separable |
| RBF (Radial Basis Function) | Default choice, works well for most cases |
| Polynomial | Data with polynomial relationships |
Example in code:
from sklearn.svm import SVC
# Linear kernel
model_linear = SVC(kernel='linear')
# RBF kernel (default)
model_rbf = SVC(kernel='rbf')
# Polynomial kernel
model_poly = SVC(kernel='poly', degree=3)
Q3.B2 (5 points) What is "bagging" in Random Forest? Why does it help?
💡 Answer
Bagging (Bootstrap Aggregating):
Process:
- Create multiple bootstrap samples of the training data (random draws with replacement, typically the same size as the original set)
- Train a separate decision tree on each sample
- Combine predictions:
  - Classification: majority voting
  - Regression: average
Why it helps:
- Reduces Overfitting
  - Each tree sees different data
  - Individual tree errors tend to cancel out
  - Ensemble is more robust
- Reduces Variance
  - Averaging many predictions is more stable
  - Less sensitive to noise in training data
- Handles Outliers Better
  - Outliers only affect some trees, not all
  - Their influence is diluted in the ensemble
- Better Generalization
  - Collective wisdom outperforms single tree
  - Works well on unseen data
Analogy: Like asking 100 doctors for diagnosis instead of 1 - the collective opinion is usually more reliable.
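The two mechanical pieces of bagging, sampling with replacement and majority voting, can be sketched in a few lines of standard-library Python. The helper names, the seed, and the example votes below are assumptions for illustration; no trees are trained here.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Return the most common label among the trees' predictions."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the run is repeatable
data = list(range(1000))
sample = bootstrap_sample(data, rng)

# Some rows are drawn several times and others not at all
unique_fraction = len(set(sample)) / len(data)
print(f"unique rows in one bootstrap sample: {unique_fraction:.1%}")

votes = ["Spam", "Not Spam", "Spam", "Spam", "Not Spam"]
print(majority_vote(votes))  # Spam
```

For large datasets the unique fraction is typically close to 63% (1 - 1/e), so each tree trains on a noticeably different view of the data, which is what lets their individual errors average out.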
Question 4: Naive Bayes & Decision Tree (25 points)
Part A: Naive Bayes Calculation (15 points)
Dataset: Email classification
| Email | Contains "Free" | Contains "Winner" | Spam? |
|---|---|---|---|
| 1 | Yes | Yes | Spam |
| 2 | Yes | No | Spam |
| 3 | No | Yes | Spam |
| 4 | No | No | Not Spam |
| 5 | Yes | No | Not Spam |
| 6 | No | No | Not Spam |
Q4.A1 (10 points) A new email contains "Free" but not "Winner". Calculate P(Spam|Free=Yes, Winner=No).
💡 Verified Answer
Naive Bayes Formula: $P(Class|Features) \propto P(Class) \times \prod P(Feature|Class)$
Step 1: Calculate Prior Probabilities
| Class | Count | P(Class) |
|---|---|---|
| Spam | 3 (emails 1,2,3) | 3/6 = 0.5 |
| Not Spam | 3 (emails 4,5,6) | 3/6 = 0.5 |
Step 2: Calculate Likelihoods
For Spam emails (1, 2, 3):
- P(Free=Yes | Spam) = 2/3 (emails 1, 2 have Free)
- P(Winner=No | Spam) = 1/3 (only email 2 has Winner=No)
For Not Spam emails (4, 5, 6):
- P(Free=Yes | Not Spam) = 1/3 (only email 5)
- P(Winner=No | Not Spam) = 3/3 = 1 (all three)
Step 3: Calculate Unnormalized Posteriors
$P(Spam|evidence) \propto P(Spam) \times P(Free=Yes|Spam) \times P(Winner=No|Spam)$ $= 0.5 \times \frac{2}{3} \times \frac{1}{3} = 0.5 \times 0.667 \times 0.333 = 0.111$
$P(NotSpam|evidence) \propto 0.5 \times \frac{1}{3} \times 1 = 0.167$
Step 4: Normalize
$P(Spam|evidence) = \frac{0.111}{0.111 + 0.167} = \frac{0.111}{0.278} = 0.40$
Answer: P(Spam | Free=Yes, Winner=No) = 0.40 = 40%
Prediction: NOT SPAM (probability < 50%)
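The four steps can be replayed with exact fractions to avoid rounding error. The sketch below encodes the six emails from the table, and unnormalized_posterior is a hypothetical helper name rather than a library function.

```python
from fractions import Fraction

# (free, winner, label) for emails 1-6 from the table above
emails = [
    ("Yes", "Yes", "Spam"),
    ("Yes", "No", "Spam"),
    ("No", "Yes", "Spam"),
    ("No", "No", "Not Spam"),
    ("Yes", "No", "Not Spam"),
    ("No", "No", "Not Spam"),
]

def unnormalized_posterior(label, free, winner):
    """P(label) * P(Free=free | label) * P(Winner=winner | label)."""
    rows = [e for e in emails if e[2] == label]
    prior = Fraction(len(rows), len(emails))
    p_free = Fraction(sum(e[0] == free for e in rows), len(rows))
    p_winner = Fraction(sum(e[1] == winner for e in rows), len(rows))
    return prior * p_free * p_winner

spam = unnormalized_posterior("Spam", "Yes", "No")          # 1/9
not_spam = unnormalized_posterior("Not Spam", "Yes", "No")  # 1/6
posterior = spam / (spam + not_spam)
print(spam, not_spam, posterior)  # 1/9 1/6 2/5
```

The exact result 2/5 matches the 0.40 obtained from the rounded decimals.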
Q4.A2 (5 points) What is the "naive" assumption in Naive Bayes? When might it fail?
💡 Answer
The "Naive" Assumption:
- Features are conditionally independent given the class
- P(A, B | Class) = P(A | Class) × P(B | Class)
- Each feature contributes independently to the prediction
When it fails:
- Correlated features
  - Example: "Free" and "Prize" often appear together in spam
  - Treating them as independent overcounts their combined effect
- Redundant features
  - Example: Having both "temperature in °C" and "temperature in °F"
  - These are perfectly correlated, violating independence
- Feature interactions matter
  - Example: Medical diagnosis where symptom combinations are important
  - Symptom A alone is harmless, but A+B together indicates disease
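The overcounting effect can be seen with the numbers from Q4.A1. The sketch below is a thought experiment under an added assumption: the "Free" feature is entered into the model twice, as a perfectly redundant copy. The function posterior_spam and its free_copies parameter are hypothetical.

```python
from fractions import Fraction

# Priors and likelihoods taken from the spam example in Q4.A1
prior = {"Spam": Fraction(1, 2), "Not Spam": Fraction(1, 2)}
p_free = {"Spam": Fraction(2, 3), "Not Spam": Fraction(1, 3)}
p_winner_no = {"Spam": Fraction(1, 3), "Not Spam": Fraction(1, 1)}

def posterior_spam(free_copies):
    """P(Spam | evidence) when the Free likelihood is multiplied in free_copies times."""
    score = {c: prior[c] * p_free[c] ** free_copies * p_winner_no[c] for c in prior}
    return score["Spam"] / sum(score.values())

print(posterior_spam(1))  # 2/5 - the original answer
print(posterior_spam(2))  # 4/7 - the redundant copy flips the prediction to Spam
```

No new information was added, yet the posterior moved from 40% to about 57%, because the model treated the duplicate as independent evidence.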
Despite this limitation: Naive Bayes often works surprisingly well in practice, especially for:
- Text classification
- Spam detection
- Sentiment analysis
Part B: Information Gain (10 points)
Q4.B1 (10 points) Calculate Information Gain for the "Contains Free" feature.
Original dataset: 3 Spam, 3 Not Spam
💡 Verified Answer
Entropy Formula: $H(S) = -\sum p_i \log_2(p_i)$
Step 1: Parent Entropy (3 Spam, 3 Not Spam)
$H(parent) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5)$ $= -0.5 \times (-1) - 0.5 \times (-1)$ $= 0.5 + 0.5 = 1.0$
(Maximum entropy for binary classification = 1.0)
Step 2: Split by "Contains Free"
Free=Yes (3 emails: 2 Spam, 1 Not Spam): $H = -\frac{2}{3} \log_2(\frac{2}{3}) - \frac{1}{3} \log_2(\frac{1}{3})$ $= -0.667 \times (-0.585) - 0.333 \times (-1.585)$ $= 0.390 + 0.528 = 0.918$
Free=No (3 emails: 1 Spam, 2 Not Spam): $H = -\frac{1}{3} \log_2(\frac{1}{3}) - \frac{2}{3} \log_2(\frac{2}{3})$ $= 0.528 + 0.390 = 0.918$
Step 3: Weighted Average Entropy $H(children) = \frac{3}{6} \times 0.918 + \frac{3}{6} \times 0.918 = 0.918$
Step 4: Information Gain $IG = H(parent) - H(children) = 1.0 - 0.918 = 0.082$
Answer: Information Gain = 0.082 bits
Interpretation: "Contains Free" provides a small amount of information for classification. Higher IG would indicate a better split.
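The same calculation can be scripted, which also makes it easy to compare features. The sketch below implements the entropy and information gain formulas from this answer. The function names are illustrative, and the "Contains Winner" counts are read from the email table (Winner=Yes: 2 Spam, 0 Not Spam; Winner=No: 1 Spam, 3 Not Spam).

```python
import math

def entropy(counts):
    """H = -sum(p * log2(p)) over the non-empty classes."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """parent and each child are class-count lists, e.g. [spam, not_spam]."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# "Contains Free": Yes -> 2 Spam, 1 Not Spam; No -> 1 Spam, 2 Not Spam
ig_free = information_gain([3, 3], [[2, 1], [1, 2]])
print(round(ig_free, 3))    # 0.082

# "Contains Winner": Yes -> 2 Spam, 0 Not Spam; No -> 1 Spam, 3 Not Spam
ig_winner = information_gain([3, 3], [[2, 0], [1, 3]])
print(round(ig_winner, 3))  # 0.459
```

On this dataset "Contains Winner" has the higher gain, so a decision tree using information gain would split on it first.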
🏁 End of Exam
| Question | Topic | Points |
|---|---|---|
| Q1 | Python Output Analysis | 20 |
| Q2 | Code Writing (choose 3/5) | 30 |
| Q3 | Pandas & SVM/Random Forest | 25 |
| Q4 | Naive Bayes & Decision Tree | 25 |
| Total | | 100 |
📝 Key Formulas Reference
| Concept | Formula |
|---|---|
| Gini Index | 1 - Σ(pᵢ²) |
| Entropy | -Σ pᵢ log₂(pᵢ) |
| Info Gain | H(parent) - Σ weighted H(children) |
| Bayes | P(A\|B) ∝ P(B\|A) × P(A) |
| Z-score | (x - μ) / σ |
| MinMax | (x - min) / (max - min) |
All code verified and tested. Show your work for partial credit. Good luck!