ABW505 Mock Exam 1 - Python & Machine Learning

📋 Exam Information

Item	Details
Total Points	100
Time Allowed	90 minutes
Format	Closed book, calculator allowed
Structure	Q1 (20pts) + Q2 (30pts, choose 3/5) + Q3 (25pts) + Q4 (25pts)

Question 1: Python Output Analysis (20 points)

Answer ALL questions. Determine the exact output.

Q1.1 (5 points)

x = 15
y = 4
print((x // y) ** 2 + x % y)

💡 Answer & Explanation

Step-by-step breakdown:

# Given values
x = 15
y = 4
 
# Step 1: Floor division x // y
# 15 // 4 = 3 (integer part of 15/4 = 3.75)
floor_result = 15 // 4  # = 3
 
# Step 2: Modulo x % y  
# 15 % 4 = 3 (remainder when 15 divided by 4)
# 15 = 4 × 3 + 3, so remainder is 3
mod_result = 15 % 4  # = 3
 
# Step 3: Power (floor_result) ** 2
# 3 ** 2 = 9
power_result = 3 ** 2  # = 9
 
# Step 4: Addition
# 9 + 3 = 12
final = 9 + 3  # = 12

Answer: 12

Key operators explained:

Operator	Name	Example
`//`	Floor division	15 // 4 = 3
`%`	Modulo	15 % 4 = 3
`**`	Exponentiation	3 ** 2 = 9
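The table only shows positive operands. As a supplementary sketch (the negative-operand case is not part of the exam question), Python's `//` rounds toward negative infinity and `%` takes the sign of the divisor, so `x == (x // y) * y + x % y` holds in both cases:

```python
# divmod(x, y) returns the same pair as (x // y, x % y).
x, y = 15, 4
quotient, remainder = divmod(x, y)
print(quotient, remainder)            # 3 3

# Floor division rounds DOWN, so -3.75 becomes -4 (not -3),
# and the remainder is whatever makes the identity hold: -15 = 4 * (-4) + 1.
neg_quotient, neg_remainder = divmod(-15, 4)
print(neg_quotient, neg_remainder)    # -4 1

# The division identity holds for both sign combinations.
print(x == quotient * y + remainder)              # True
print(-15 == neg_quotient * 4 + neg_remainder)    # True
```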

Q1.2 (5 points)

numbers = [10, 20, 30, 40, 50]
numbers[1:4] = [100]
print(len(numbers))
print(numbers[2])

💡 Answer & Explanation

Step-by-step breakdown:

# Original list
numbers = [10, 20, 30, 40, 50]
# Indices:   0   1   2   3   4
 
# Slice assignment: numbers[1:4] = [100]
# This replaces elements at indices 1, 2, 3 with a single element 100
# Before: [10, 20, 30, 40, 50]
#              ^^^^^^^^^^^^ <- indices 1:4 (elements 20, 30, 40)
# After:  [10, 100, 50]
#              ^^^ <- replaced with single element
 
# Result after slice assignment
# numbers = [10, 100, 50]
# Indices:    0    1   2
 
# len(numbers) = 3 (was 5, replaced 3 elements with 1)
# numbers[2] = 50 (third element)

Answers:

len(numbers) → 3
numbers[2] → 50

Important concept: Slice assignment can change list size! Here we replaced 3 elements (indices 1, 2, 3) with 1 element, reducing length from 5 to 3.
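Slice assignment can also grow a list or delete from it. The variations below are a small illustrative sketch (the variable names `grown`, `shrunk`, and `nested` are not from the exam question); the last case contrasts slice assignment with plain index assignment:

```python
numbers = [10, 20, 30, 40, 50]

# Replace one element (the slice 1:2) with three: length goes from 5 to 7.
grown = numbers.copy()
grown[1:2] = [1, 2, 3]
print(grown)     # [10, 1, 2, 3, 30, 40, 50]

# Assigning an empty list deletes the slice: length goes from 5 to 2.
shrunk = numbers.copy()
shrunk[1:4] = []
print(shrunk)    # [10, 50]

# Index assignment (no colon) stores the list itself as ONE nested element.
nested = numbers.copy()
nested[1] = [100]
print(nested)    # [10, [100], 30, 40, 50]
```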

Q1.3 (5 points)

def mystery(a, b=5, c=10):
    return a * 2 + b - c
 
result = mystery(3, c=4)
print(result)

💡 Answer & Explanation

Step-by-step breakdown:

# Function definition
def mystery(a, b=5, c=10):
    # a: required parameter
    # b: optional, default = 5
    # c: optional, default = 10
    return a * 2 + b - c
 
# Function call: mystery(3, c=4)
# a = 3 (positional argument, first position)
# b = 5 (uses default value, NOT provided in call)
# c = 4 (keyword argument, overrides default of 10)
 
# Calculation:
# a * 2 + b - c
# = 3 * 2 + 5 - 4
# = 6 + 5 - 4
# = 7

Answer: 7

Key concept: Keyword arguments (c=4) allow you to skip over parameters with defaults. Here b uses its default value of 5.
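To see how positional and keyword arguments interact, here are a few more calls to the same function. The extra calls are illustrative additions, not part of the exam question:

```python
def mystery(a, b=5, c=10):
    return a * 2 + b - c

only_a = mystery(3)               # b=5, c=10      -> 6 + 5 - 10
two_positional = mystery(3, 1)    # 1 fills b      -> 6 + 1 - 10
any_order = mystery(3, c=4, b=0)  # keyword order is free -> 6 + 0 - 4
all_keywords = mystery(c=1, a=2)  # a by keyword too      -> 4 + 5 - 1

print(only_a, two_positional, any_order, all_keywords)  # 1 -3 2 8
```

A positional argument may not follow a keyword argument, so `mystery(c=4, 3)` is a SyntaxError.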

Q1.4 (5 points)

data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
total = 0
for key in data:
    total += sum(data[key])
print(total)

💡 Answer & Explanation

Step-by-step breakdown:

# Dictionary with lists as values
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
 
total = 0
 
# Iterating over a dictionary gives KEYS, not values
for key in data:  # key will be 'A', then 'B'
    
    # First iteration: key = 'A'
    # data['A'] = [1, 2, 3]
    # sum([1, 2, 3]) = 6
    # total = 0 + 6 = 6
    
    # Second iteration: key = 'B'
    # data['B'] = [4, 5, 6]
    # sum([4, 5, 6]) = 15
    # total = 6 + 15 = 21
    
    total += sum(data[key])
 
print(total)  # 21

Answer: 21

Calculation summary:

sum([1, 2, 3]) = 6
sum([4, 5, 6]) = 15
Total = 6 + 15 = 21
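The same total can be computed without key lookups. This sketch shows the `.values()` and `.items()` views as alternatives to iterating over keys (the names `total_from_values` and `per_key` are illustrative):

```python
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}

# .values() yields the lists directly, so data[key] is not needed.
total_from_values = sum(sum(values) for values in data.values())

# .items() yields (key, value) pairs when both are wanted.
per_key = {key: sum(values) for key, values in data.items()}

print(total_from_values)   # 21
print(per_key)             # {'A': 6, 'B': 15}
```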

Question 2: Code Writing (30 points)

Choose 3 out of 5 questions. Each is worth 10 points.

Q2.1 - Grade Calculator (10 points)

Write a function grade_calculator(score) that:

Returns letter grade: 90+ → "A", 80+ → "B", 70+ → "C", 60+ → "D", <60 → "F"
Returns "Invalid" for scores < 0 or > 100

💡 Verified Answer

def grade_calculator(score):
    """
    Convert numeric score to letter grade.
    
    Args:
        score: Numeric score (expected range: 0-100)
        
    Returns:
        str: Letter grade (A/B/C/D/F) or "Invalid" for out-of-range scores
    
    Examples:
        >>> grade_calculator(95)
        'A'
        >>> grade_calculator(-5)
        'Invalid'
    """
    # STEP 1: Validate input range FIRST
    # Must check invalid cases before checking grade ranges
    if score < 0 or score > 100:
        return "Invalid"
    
    # STEP 2: Check grades from highest to lowest
    # Using elif ensures only one condition matches
    if score >= 90:
        return "A"  # 90-100
    elif score >= 80:
        return "B"  # 80-89
    elif score >= 70:
        return "C"  # 70-79
    elif score >= 60:
        return "D"  # 60-69
    else:
        return "F"  # 0-59
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (95, "A"),
        (85, "B"),
        (73, "C"),
        (65, "D"),
        (45, "F"),
        (-5, "Invalid"),
        (105, "Invalid"),
        (100, "A"),  # Edge case: exactly 100
        (0, "F"),    # Edge case: exactly 0
    ]
    
    print("Testing grade_calculator:")
    for score, expected in test_cases:
        result = grade_calculator(score)
        status = "✓" if result == expected else "✗"
        print(f"  {status} grade_calculator({score}) = {result} (expected {expected})")

Test Output:

Testing grade_calculator:
  ✓ grade_calculator(95) = A (expected A)
  ✓ grade_calculator(85) = B (expected B)
  ✓ grade_calculator(73) = C (expected C)
  ✓ grade_calculator(65) = D (expected D)
  ✓ grade_calculator(45) = F (expected F)
  ✓ grade_calculator(-5) = Invalid (expected Invalid)
  ✓ grade_calculator(105) = Invalid (expected Invalid)
  ✓ grade_calculator(100) = A (expected A)
  ✓ grade_calculator(0) = F (expected F)

Common mistakes to avoid:

Not validating input range first
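Using separate if statements instead of elif (harmless when every branch returns immediately, but wrong as soon as the branches assign a grade variable instead)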
Checking in wrong order (e.g., 60+ before 90+)
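Two of these mistakes are easy to reproduce. The functions below are deliberately buggy sketches with hypothetical names (`grade_wrong_order`, `grade_separate_ifs`); both return "D" for a score of 95:

```python
def grade_wrong_order(score):
    """Buggy on purpose: the lowest threshold is tested first."""
    if score >= 60:
        return "D"      # every score from 60 upward stops here
    elif score >= 70:
        return "C"      # unreachable
    elif score >= 80:
        return "B"      # unreachable
    elif score >= 90:
        return "A"      # unreachable
    return "F"


def grade_separate_ifs(score):
    """Buggy on purpose: independent ifs keep overwriting the grade."""
    grade = "F"
    if score >= 90:
        grade = "A"
    if score >= 80:
        grade = "B"     # also true for 95, so "A" is overwritten
    if score >= 70:
        grade = "C"
    if score >= 60:
        grade = "D"
    return grade


print(grade_wrong_order(95))    # D
print(grade_separate_ifs(95))   # D
```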

Q2.2 - Remove Duplicates (10 points)

Write a function remove_duplicates(lst) that:

Removes duplicates from a list
Preserves the order of first occurrence
Example: [1, 2, 2, 3, 1, 4] → [1, 2, 3, 4]

💡 Verified Answer

def remove_duplicates(lst):
    """
    Remove duplicate elements while preserving order of first occurrence.
    
    Args:
        lst: Input list with possible duplicates
        
    Returns:
        list: New list with duplicates removed, order preserved
        
    Examples:
        >>> remove_duplicates([1, 2, 2, 3, 1, 4])
        [1, 2, 3, 4]
    """
    # Track elements we've already seen
    seen = []
    
    # Iterate through original list
    for item in lst:
        # Only add to result if not seen before
        if item not in seen:
            seen.append(item)
    
    return seen
 
 
# Alternative approach using dictionary (Python 3.7+ preserves order)
def remove_duplicates_v2(lst):
    """
    Remove duplicates using dict.fromkeys(): roughly O(n) instead of the
    O(n^2) list scan above, but every element must be hashable.
    Works because dictionaries preserve insertion order in Python 3.7+.
    """
    return list(dict.fromkeys(lst))
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        [1, 2, 2, 3, 1, 4],        # Basic case
        [5, 5, 5, 5],              # All duplicates
        [1, 2, 3, 4],              # No duplicates
        [],                        # Empty list
        ['a', 'b', 'a', 'c'],      # Strings
    ]
    
    print("Testing remove_duplicates:")
    for test in test_cases:
        result = remove_duplicates(test)
        print(f"  {test} → {result}")

Test Output:

Testing remove_duplicates:
  [1, 2, 2, 3, 1, 4] → [1, 2, 3, 4]
  [5, 5, 5, 5] → [5]
  [1, 2, 3, 4] → [1, 2, 3, 4]
  [] → []
  ['a', 'b', 'a', 'c'] → ['a', 'b', 'c']

Why not use set()? Sets don't preserve order! list(set([1, 2, 2, 3, 1, 4])) might give [1, 2, 3, 4] but order is NOT guaranteed.
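A set is still useful as a fast membership check, as long as the output order comes from a separate list. The function below is a sketch of that idea under the assumption that all elements are hashable; the name `remove_duplicates_fast` is illustrative:

```python
def remove_duplicates_fast(lst):
    """Order-preserving de-duplication with O(1) average membership tests."""
    seen = set()      # answers "have we met this item?" quickly
    result = []       # keeps the order of first occurrence
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(remove_duplicates_fast([1, 2, 2, 3, 1, 4]))   # [1, 2, 3, 4]
print(remove_duplicates_fast(['b', 'a', 'b']))      # ['b', 'a']
```

Unhashable elements such as nested lists raise a TypeError here, whereas the list-based version in the answer handles them.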

Q2.3 - Fibonacci Sequence (10 points)

Write a function fibonacci(n) that:

Returns the first n Fibonacci numbers as a list
Sequence: 0, 1, 1, 2, 3, 5, 8, 13...

💡 Verified Answer

def fibonacci(n):
    """
    Generate the first n Fibonacci numbers.
    
    Fibonacci sequence: Each number is the sum of the two preceding ones.
    Starts with 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
    
    Args:
        n: Number of Fibonacci numbers to generate (non-negative integer)
        
    Returns:
        list: First n Fibonacci numbers
        
    Examples:
        >>> fibonacci(5)
        [0, 1, 1, 2, 3]
        >>> fibonacci(0)
        []
    """
    # Handle edge cases
    if n <= 0:
        return []  # No numbers requested
    if n == 1:
        return [0]  # Only first number
    
    # Initialize with first two Fibonacci numbers
    result = [0, 1]
    
    # Generate remaining numbers
    for i in range(2, n):
        # Each new number = sum of last two numbers
        # Using negative indexing: result[-1] is last, result[-2] is second-to-last
        next_num = result[-1] + result[-2]
        result.append(next_num)
    
    return result
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [0, 1, 2, 5, 8, 10]
    
    print("Testing fibonacci:")
    for n in test_cases:
        result = fibonacci(n)
        print(f"  fibonacci({n}) = {result}")

Test Output:

Testing fibonacci:
  fibonacci(0) = []
  fibonacci(1) = [0]
  fibonacci(2) = [0, 1]
  fibonacci(5) = [0, 1, 1, 2, 3]
  fibonacci(8) = [0, 1, 1, 2, 3, 5, 8, 13]
  fibonacci(10) = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

How it works:

Position:  0   1   2   3   4   5   6   7
Value:     0   1   1   2   3   5   8   13
Sum:               0+1 1+1 1+2 2+3 3+5 5+8
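The "sum of the previous two" rule can also be written with two running variables instead of list indexing. This is an alternative sketch, not the verified answer above; `fibonacci_pairs` is a hypothetical name:

```python
def fibonacci_pairs(n):
    """Return the first n Fibonacci numbers using tuple unpacking."""
    result = []
    current, following = 0, 1
    for _ in range(max(n, 0)):        # a negative n yields an empty list
        result.append(current)
        # Slide the window: the pair (a, b) becomes (b, a + b).
        current, following = following, current + following
    return result


print(fibonacci_pairs(8))   # [0, 1, 1, 2, 3, 5, 8, 13]
```

This version needs no special cases for n = 0 or n = 1, because the loop simply runs that many times.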

Q2.4 - Prime Number Check (10 points)

Write a function is_prime(num) that:

Returns True if num is prime, False otherwise
Handles edge cases (num < 2)

💡 Verified Answer

def is_prime(num):
    """
    Check if a number is prime.
    
    A prime number is a natural number greater than 1 that has no positive 
    divisors other than 1 and itself.
    
    Args:
        num: Integer to check
        
    Returns:
        bool: True if prime, False otherwise
        
    Examples:
        >>> is_prime(7)
        True
        >>> is_prime(12)
        False
    """
    # Numbers less than 2 are not prime by definition
    # This handles 0, 1, and negative numbers
    if num < 2:
        return False
    
    # 2 is the only even prime number
    if num == 2:
        return True
    
    # All other even numbers are not prime
    # (They're divisible by 2)
    if num % 2 == 0:
        return False
    
    # Check odd divisors from 3 up to √num
    # Why √num? If num = a × b, at least one of a,b must be ≤ √num
    # If no divisor found up to √num, num is prime
    for i in range(3, int(num ** 0.5) + 1, 2):  # Step by 2 (odd numbers only)
        if num % i == 0:
            return False  # Found a divisor, not prime
    
    return True  # No divisors found, it's prime
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (1, False),   # Not prime (less than 2)
        (2, True),    # Prime (smallest prime)
        (3, True),    # Prime
        (4, False),   # Not prime (2 × 2)
        (7, True),    # Prime
        (9, False),   # Not prime (3 × 3)
        (11, True),   # Prime
        (25, False),  # Not prime (5 × 5)
        (29, True),   # Prime
        (-5, False),  # Negative, not prime
    ]
    
    print("Testing is_prime:")
    for num, expected in test_cases:
        result = is_prime(num)
        status = "✓" if result == expected else "✗"
        print(f"  {status} is_prime({num}) = {result}")

Test Output:

Testing is_prime:
  ✓ is_prime(1) = False
  ✓ is_prime(2) = True
  ✓ is_prime(3) = True
  ✓ is_prime(4) = False
  ✓ is_prime(7) = True
  ✓ is_prime(9) = False
  ✓ is_prime(11) = True
  ✓ is_prime(25) = False
  ✓ is_prime(29) = True
  ✓ is_prime(-5) = False

Optimization: Checking up to √n instead of n reduces time complexity from O(n) to O(√n).
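To make the saving concrete, the sketch below counts candidate divisors for 97 under both strategies. It uses `math.isqrt` (Python 3.8+) as an integer-only alternative to `int(num ** 0.5)`; the name `is_prime_isqrt` is illustrative and not part of the verified answer:

```python
import math


def is_prime_isqrt(num):
    """Trial division by odd numbers up to the integer square root."""
    if num < 2:
        return False
    if num < 4:
        return True                   # 2 and 3
    if num % 2 == 0:
        return False
    for divisor in range(3, math.isqrt(num) + 1, 2):
        if num % divisor == 0:
            return False
    return True


num = 97
naive_candidates = len(range(2, num))                       # 2 .. 96
sqrt_candidates = len(range(3, math.isqrt(num) + 1, 2))     # 3, 5, 7, 9

print(is_prime_isqrt(num))                  # True
print(naive_candidates, sqrt_candidates)    # 95 4
```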

Q2.5 - Tuple Statistics (10 points)

Write a function tuple_stats(data) that:

Input: tuple of numbers
Return: tuple of (min, max, average rounded to 2 decimals)

💡 Verified Answer

def tuple_stats(data):
    """
    Calculate statistics for a tuple of numbers.
    
    Args:
        data: Tuple of numeric values
        
    Returns:
        tuple: (minimum, maximum, average) where average is rounded to 2 decimals
        
    Raises:
        ValueError: If tuple is empty
        
    Examples:
        >>> tuple_stats((10, 20, 30, 40))
        (10, 40, 25.0)
    """
    # Handle empty tuple edge case
    if len(data) == 0:
        raise ValueError("Cannot compute stats for empty tuple")
    
    # Calculate statistics using built-in functions
    minimum = min(data)        # Smallest value
    maximum = max(data)        # Largest value
    average = sum(data) / len(data)  # Arithmetic mean
    
    # Round average to 2 decimal places
    average = round(average, 2)
    
    # Return as tuple (note: using parentheses to make it clear)
    return (minimum, maximum, average)
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (10, 20, 30, 40),      # Even spread
        (5, 15, 25),           # Odd count
        (7,),                  # Single element
        (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),  # Larger tuple
    ]
    
    print("Testing tuple_stats:")
    for data in test_cases:
        result = tuple_stats(data)
        print(f"  tuple_stats({data})")
        print(f"    → (min={result[0]}, max={result[1]}, avg={result[2]})")

Test Output:

Testing tuple_stats:
  tuple_stats((10, 20, 30, 40))
    → (min=10, max=40, avg=25.0)
  tuple_stats((5, 15, 25))
    → (min=5, max=25, avg=15.0)
  tuple_stats((7,))
    → (min=7, max=7, avg=7.0)
  tuple_stats((1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
    → (min=1, max=10, avg=5.5)
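None of the test cases above actually exercise the rounding or the empty-tuple branch. The following sketch adds both, using a condensed but equivalent version of the function so that the block runs on its own:

```python
def tuple_stats(data):
    """Condensed version of the answer above: (min, max, rounded mean)."""
    if len(data) == 0:
        raise ValueError("Cannot compute stats for empty tuple")
    return (min(data), max(data), round(sum(data) / len(data), 2))


# 5 / 3 = 1.666..., so the rounding to 2 decimals becomes visible.
print(tuple_stats((1, 2, 2)))     # (1, 2, 1.67)

# The empty tuple is rejected before min() or the division can fail.
try:
    tuple_stats(())
except ValueError as error:
    print(f"ValueError: {error}")
```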

Question 3: Pandas & ML Basics (25 points)

Part A: Theory (10 points)

Q3.A1 (5 points) Explain the difference between fit() and predict() in scikit-learn.

💡 Answer

Method	Purpose	When Called	What It Does
`fit()`	Train the model	Once, on training data	Learns patterns/parameters from data
`predict()`	Use the model	On test/new data	Applies learned patterns to make predictions

Workflow example:

# Step 1: Create model
model = DecisionTreeClassifier()
 
# Step 2: Train model (learn from training data)
model.fit(X_train, y_train)  # Learns patterns
 
# Step 3: Use model (apply to new data)
predictions = model.predict(X_test)  # Makes predictions

Analogy:

fit() = studying for an exam
predict() = taking the exam
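The split of responsibilities can be shown without any library. The class below is a toy stand-in, not scikit-learn code: `MajorityClassifier` is a hypothetical model that only mirrors the `fit()`/`predict()` method names, and the tiny dataset is made up for illustration.

```python
from collections import Counter


class MajorityClassifier:
    """Toy model: always predicts the most common training label."""

    def fit(self, X, y):
        # "Learning" here is just counting labels; the result is stored
        # on the object so predict() can use it later.
        self.majority_label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        # Prediction only reads the stored state; it never sees labels.
        return [self.majority_label_ for _ in X]


X_train = [[1], [2], [3], [4]]
y_train = ["spam", "spam", "ham", "spam"]
X_test = [[5], [6]]

model = MajorityClassifier()
model.fit(X_train, y_train)            # studies the labelled examples
predictions = model.predict(X_test)    # answers for unseen inputs
print(predictions)                     # ['spam', 'spam']
```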

Q3.A2 (5 points) Why do we need train-test split? Why not use all data for training?

💡 Answer

Why train-test split is essential:

Evaluate on unseen data: We need to test how the model performs on data it hasn't seen during training.
Detect overfitting: If we train and test on the same data, the model might just memorize the answers (overfitting). Train-test split reveals if the model generalizes well.
Simulate real-world usage: In production, the model will encounter new, unseen data. Testing on held-out data simulates this.
Get honest performance estimate: Training accuracy is often misleadingly high; test accuracy gives a realistic measure.

What happens without split:

Model could achieve 100% accuracy on training data
But fail completely on new data
No way to detect this problem until deployment

Typical split ratios:

80/20 (training/test)
70/30 (training/test)
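What an 80/20 split does mechanically can be sketched in plain Python. This is a hand-rolled illustration with a hypothetical helper name (`simple_train_test_split`) and an assumed fixed seed; in practice a library routine such as scikit-learn's `train_test_split` would normally be used instead.

```python
import random


def simple_train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of rows, then cut off the first test_ratio share."""
    shuffled = rows[:]                      # leave the caller's list intact
    random.Random(seed).shuffle(shuffled)   # seeded, so the split is repeatable
    test_size = round(len(shuffled) * test_ratio)
    return shuffled[test_size:], shuffled[:test_size]


rows = list(range(10))
train_rows, test_rows = simple_train_test_split(rows)

print(len(train_rows), len(test_rows))          # 8 2
print(set(train_rows).isdisjoint(test_rows))    # True: no row is in both sets
```

Shuffling before cutting matters: if the rows were sorted by class or by date, an unshuffled cut would give the model a test set that looks nothing like its training set.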

Part B: Pandas Code (15 points)

Given this DataFrame:

import pandas as pd
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, None, 28],
    'Salary': [50000, 60000, 75000, 65000, None],
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR']
}
df = pd.DataFrame(data)

Q3.B1 (5 points) Fill missing Age with mean, missing Salary with 55000.

💡 Verified Answer

import pandas as pd
 
# Create the DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, None, 28],
    'Salary': [50000, 60000, 75000, 65000, None],
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR']
}
df = pd.DataFrame(data)
 
print("Before filling:")
print(df)
print()
 
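# Assign the result back to the column. Calling fillna(..., inplace=True)
# on df['Age'] triggers a chained-assignment warning in recent pandas
# versions and does not update df under Copy-on-Write.
# Fill missing Age with mean
age_mean = df['Age'].mean()  # Calculate mean first (ignores NaN)
print(f"Age mean (excluding NaN): {age_mean}")  # = 29.5
df['Age'] = df['Age'].fillna(age_mean)
 
# Fill missing Salary with 55000
df['Salary'] = df['Salary'].fillna(55000)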
 
print("\nAfter filling:")
print(df)

Output:

Before filling:
      Name   Age   Salary Department
0    Alice  25.0  50000.0         IT
1      Bob  30.0  60000.0         HR
2  Charlie  35.0  75000.0         IT
3    David   NaN  65000.0    Finance
4      Eve  28.0      NaN         HR

Age mean (excluding NaN): 29.5

After filling:
      Name   Age   Salary Department
0    Alice  25.0  50000.0         IT
1      Bob  30.0  60000.0         HR
2  Charlie  35.0  75000.0         IT
3    David  29.5  65000.0    Finance
4      Eve  28.0  55000.0         HR

Note: mean() automatically ignores NaN values when calculating.

Q3.B2 (5 points) Calculate average salary by Department.

💡 Verified Answer

# Group by Department and calculate mean of Salary
avg_salary_by_dept = df.groupby('Department')['Salary'].mean()
 
print("Average Salary by Department:")
print(avg_salary_by_dept)

Output (after filling missing values):

Average Salary by Department:
Department
Finance    65000.0
HR         57500.0
IT         62500.0
Name: Salary, dtype: float64

Calculation breakdown:

Finance: 65000 (only David)
HR: (60000 + 55000) / 2 = 57500 (Bob + Eve)
IT: (50000 + 75000) / 2 = 62500 (Alice + Charlie)
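The breakdown above is the classic split-apply-combine pattern. As a rough plain-Python sketch of what `groupby(...).mean()` computes (this is not how pandas implements it, and the `employees` list simply restates the filled DataFrame):

```python
from collections import defaultdict

# (name, department, salary) after the missing salary was filled with 55000
employees = [
    ("Alice", "IT", 50000),
    ("Bob", "HR", 60000),
    ("Charlie", "IT", 75000),
    ("David", "Finance", 65000),
    ("Eve", "HR", 55000),
]

# Split: collect the salaries of each department.
salaries_by_dept = defaultdict(list)
for _name, dept, salary in employees:
    salaries_by_dept[dept].append(salary)

# Apply + combine: one mean per department, sorted by department name.
averages = {
    dept: sum(values) / len(values)
    for dept, values in sorted(salaries_by_dept.items())
}
print(averages)   # {'Finance': 65000.0, 'HR': 57500.0, 'IT': 62500.0}
```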

Q3.B3 (5 points) Select IT employees with Age > 26.

💡 Verified Answer

# Filter with multiple conditions
# IMPORTANT: Use & for AND, | for OR
# IMPORTANT: Wrap each condition in parentheses
result = df[(df['Department'] == 'IT') & (df['Age'] > 26)]
 
print("IT employees with Age > 26:")
print(result)

Output:

IT employees with Age > 26:
      Name   Age   Salary Department
2  Charlie  35.0  75000.0         IT

Syntax rules for pandas filtering:

Use & instead of and
Use | instead of or
Use ~ instead of not
Wrap each condition in parentheses

Wrong: df[df['A'] == 1 and df['B'] == 2] (raises ValueError, because `and` asks for a single True/False from a whole Series)
Correct: df[(df['A'] == 1) & (df['B'] == 2)]
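The three operators and the failure mode of `and` can be tried on a small frame. This sketch assumes pandas is installed and uses a reduced three-row version of the exam data; the mask names `is_it` and `older` are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Department': ['IT', 'HR', 'IT'],
})

# Each comparison produces a boolean Series (one True/False per row).
is_it = df['Department'] == 'IT'
older = df['Age'] > 26

both = df[is_it & older]['Name'].tolist()     # AND, element by element
either = df[is_it | older]['Name'].tolist()   # OR, element by element
not_it = df[~is_it]['Name'].tolist()          # NOT, element by element
print(both, either, not_it)

# Python's `and` needs one truth value for the whole Series, which fails.
try:
    df[is_it and older]
    and_failed = False
except ValueError as error:
    and_failed = True
    print("ValueError:", error)
```

Naming the masks first also sidesteps the parentheses rule, since `&` binds tighter than `==` and `>` only when the comparisons are written inline.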

Question 4: Decision Tree & Naive Bayes (25 points)

Part A: Theory (10 points)

Q4.A1 (5 points) List THREE advantages of Decision Trees.

💡 Answer

Any three of the following five answer the question:

1. Easy to interpret and visualize
   - Can draw the tree and follow decision paths
   - Non-technical stakeholders can understand the logic
   - "If-then" rules are intuitive
2. No feature scaling required
   - Works directly with raw data values
   - Unlike SVM or KNN, doesn't need normalization
   - Saves preprocessing time
3. Handles both numerical and categorical data
   - Can split on continuous values (Age > 30)
   - Can split on categories (Color == 'Red'); note that scikit-learn's trees expect categories to be encoded as numbers first
   - Versatile for mixed datasets
4. Captures non-linear relationships
   - Can model complex decision boundaries
   - Doesn't assume linear separability
5. Shows feature importance
   - Reveals which features matter most
   - Helps with feature selection

Q4.A2 (5 points) Gini Index vs Information Gain. Which does CART use?

💡 Answer

Metric	Formula	Range (binary)	Used By
Gini Index	1 - Σ(pᵢ²)	0 to 0.5	CART
Entropy/Information Gain	-Σ(pᵢ log₂ pᵢ)	0 to 1	ID3, C4.5

CART (Classification and Regression Trees) uses the Gini Index to choose classification splits.

Why Gini?

Faster to compute (no logarithm)
Similar results to entropy in practice
Slightly favors larger partitions

Interpretation:

Gini = 0 → Pure node (all same class)
Gini = 0.5 → Maximum impurity (50/50 split)
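Both impurity measures from the table are short enough to compute directly. The sketch below evaluates them for a pure node, a 90/10 node, and a 50/50 node; the function names `gini` and `entropy` are illustrative, and the inputs are class probabilities that sum to 1:

```python
import math


def gini(probabilities):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1 - sum(p ** 2 for p in probabilities)


def entropy(probabilities):
    """Shannon entropy in bits; classes with p = 0 contribute nothing."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


for probs in ([1.0, 0.0], [0.9, 0.1], [0.5, 0.5]):
    print(probs, round(gini(probs), 3), round(entropy(probs), 3))
```

For two classes, both measures are 0 for the pure node and reach their maximum at the 50/50 node (0.5 for Gini, 1.0 for entropy), which matches the ranges in the table.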

Part B: Gini Calculation (15 points)

Scenario: Email classification with 20 emails (12 Spam, 8 Not Spam)

Split 1 - "Contains free":

Contains "free": 10 emails (9 Spam, 1 Not Spam)
No "free": 10 emails (3 Spam, 7 Not Spam)

Split 2 - "Contains meeting":

Contains "meeting": 8 emails (2 Spam, 6 Not Spam)
No "meeting": 12 emails (10 Spam, 2 Not Spam)

Q4.B1 (8 points) Calculate Gini Index for Split 1.

💡 Verified Answer

Formula: Gini = 1 - Σ(pᵢ²)

Step 1: Gini for "Contains free" node (10 emails: 9 Spam, 1 Not Spam)

P(Spam) = 9/10 = 0.9
P(Not Spam) = 1/10 = 0.1

Gini = 1 - (0.9² + 0.1²)
     = 1 - (0.81 + 0.01)
     = 1 - 0.82
     = 0.18

Step 2: Gini for "No free" node (10 emails: 3 Spam, 7 Not Spam)

P(Spam) = 3/10 = 0.3
P(Not Spam) = 7/10 = 0.7

Gini = 1 - (0.3² + 0.7²)
     = 1 - (0.09 + 0.49)
     = 1 - 0.58
     = 0.42

Step 3: Weighted Average Gini

Gini(Split 1) = (10/20) × 0.18 + (10/20) × 0.42
              = 0.5 × 0.18 + 0.5 × 0.42
              = 0.09 + 0.21
              = 0.30

Answer: Split 1 Gini = 0.30

Q4.B2 (7 points) Calculate Gini Index for Split 2. Which split is better?

💡 Verified Answer

Step 1: Gini for "Contains meeting" node (8 emails: 2 Spam, 6 Not Spam)

P(Spam) = 2/8 = 0.25
P(Not Spam) = 6/8 = 0.75

Gini = 1 - (0.25² + 0.75²)
     = 1 - (0.0625 + 0.5625)
     = 1 - 0.625
     = 0.375

Step 2: Gini for "No meeting" node (12 emails: 10 Spam, 2 Not Spam)

P(Spam) = 10/12 ≈ 0.833
P(Not Spam) = 2/12 ≈ 0.167

Gini = 1 - (0.833² + 0.167²)
     ≈ 1 - (0.694 + 0.028)
     = 1 - 0.722
     = 0.278

Step 3: Weighted Average Gini

Gini(Split 2) = (8/20) × 0.375 + (12/20) × 0.278
              = 0.4 × 0.375 + 0.6 × 0.278
              ≈ 0.15 + 0.167
              = 0.317

Answer: Split 2 Gini ≈ 0.317

Comparison:

Split	Gini Index
Split 1 ("free")	0.30 ← Better
Split 2 ("meeting")	0.317

Better split: Split 1 ("Contains free")

Reason: Lower Gini = Lower impurity = Better separation of classes
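The hand calculations for both splits can be checked with a few lines of code. This is a verification sketch with illustrative function names (`gini_from_counts`, `weighted_gini`); each child node is given as a (Spam, Not Spam) count pair taken from the scenario:

```python
def gini_from_counts(counts):
    """Gini impurity of one node, given the count of each class."""
    total = sum(counts)
    return 1 - sum((count / total) ** 2 for count in counts)


def weighted_gini(children):
    """Average the children's Gini, weighted by each child's share of samples."""
    total = sum(sum(counts) for counts in children)
    return sum(sum(counts) / total * gini_from_counts(counts) for counts in children)


split_free = [(9, 1), (3, 7)]        # contains "free" / no "free"
split_meeting = [(2, 6), (10, 2)]    # contains "meeting" / no "meeting"

print(round(gini_from_counts((12, 8)), 3))     # 0.48  (parent node, before any split)
print(round(weighted_gini(split_free), 3))     # 0.3
print(round(weighted_gini(split_meeting), 3))  # 0.317
```

Both splits reduce impurity compared with the parent node (0.48), and the "free" split reduces it the most.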

🏁 End of Exam

Question	Topic	Points
Q1	Python Output Analysis	20
Q2	Code Writing (choose 3/5)	30
Q3	Pandas & ML Theory	25
Q4	Decision Tree & Gini	25
Total		100

📝 Key Formulas Reference

Concept	Formula
Gini Index	1 - Σ(pᵢ²)
Entropy	-Σ pᵢ log₂(pᵢ)
Info Gain	H(parent) - Σ weighted H(children)
Bayes	P(A|B) ∝ P(B|A) × P(A)
Z-score	(x - μ) / σ
MinMax	(x - min) / (max - min)
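The two scaling formulas at the bottom of the table can be sketched as follows. The function names are illustrative, the sample values are made up, and the z-score uses the population standard deviation (`statistics.pstdev`) as an assumption, since the table does not say which σ is meant:

```python
import statistics


def z_scores(values):
    """(x - mean) / std for every x, using the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)   # assumption: population σ, not sample s
    return [(value - mean) / std for value in values]


def min_max(values):
    """(x - min) / (max - min) for every x, mapping the data onto [0, 1]."""
    lowest, highest = min(values), max(values)
    return [(value - lowest) / (highest - lowest) for value in values]


ages = [25, 30, 35]
print(min_max(ages))                            # [0.0, 0.5, 1.0]
print([round(z, 3) for z in z_scores(ages)])    # [-1.225, 0.0, 1.225]
```

Both functions divide by zero when all values are identical, so constant columns need separate handling.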

All code verified and tested. Show your work for partial credit. Good luck!

ABW505 Mock Exam 1 - Python & Machine Learning

📋 Exam Information

Item	Details
Total Points	100
Time Allowed	90 minutes
Format	Closed book, calculator allowed
Structure	Q1 (20pts) + Q2 (30pts, choose 3/5) + Q3 (25pts) + Q4 (25pts)

Question 1: Python Output Analysis (20 points)

Answer ALL questions. Determine exact output.

Q1.1 (5 points)

x = 15
y = 4
print((x // y) ** 2 + x % y)

💡 Click to View Answer & Explanation

Step-by-step breakdown:

# Given values
x = 15
y = 4
 
# Step 1: Floor division x // y
# 15 // 4 = 3 (integer part of 15/4 = 3.75)
floor_result = 15 // 4  # = 3
 
# Step 2: Modulo x % y  
# 15 % 4 = 3 (remainder when 15 divided by 4)
# 15 = 4 × 3 + 3, so remainder is 3
mod_result = 15 % 4  # = 3
 
# Step 3: Power (floor_result) ** 2
# 3 ** 2 = 9
power_result = 3 ** 2  # = 9
 
# Step 4: Addition
# 9 + 3 = 12
final = 9 + 3  # = 12

Answer: 12

Key operators explained:

Operator	Name	Example
`//`	Floor division	15 // 4 = 3
`%`	Modulo	15 % 4 = 3
`**`	Exponentiation	3 ** 2 = 9

Q1.2 (5 points)

numbers = [10, 20, 30, 40, 50]
numbers[1:4] = [100]
print(len(numbers))
print(numbers[2])

💡 Click to View Answer & Explanation

Step-by-step breakdown:

# Original list
numbers = [10, 20, 30, 40, 50]
# Indices:   0   1   2   3   4
 
# Slice assignment: numbers[1:4] = [100]
# This replaces elements at indices 1, 2, 3 with a single element 100
# Before: [10, 20, 30, 40, 50]
#              ^^^^^^^^^^^^ <- indices 1:4 (elements 20, 30, 40)
# After:  [10, 100, 50]
#              ^^^ <- replaced with single element
 
# Result after slice assignment
# numbers = [10, 100, 50]
# Indices:    0    1   2
 
# len(numbers) = 3 (was 5, replaced 3 elements with 1)
# numbers[2] = 50 (third element)

Answers:

len(numbers) → 3
numbers[2] → 50

Important concept: Slice assignment can change list size! Here we replaced 3 elements (indices 1, 2, 3) with 1 element, reducing length from 5 to 3.

Q1.3 (5 points)

def mystery(a, b=5, c=10):
    return a * 2 + b - c
 
result = mystery(3, c=4)
print(result)

💡 Click to View Answer & Explanation

Step-by-step breakdown:

# Function definition
def mystery(a, b=5, c=10):
    # a: required parameter
    # b: optional, default = 5
    # c: optional, default = 10
    return a * 2 + b - c
 
# Function call: mystery(3, c=4)
# a = 3 (positional argument, first position)
# b = 5 (uses default value, NOT provided in call)
# c = 4 (keyword argument, overrides default of 10)
 
# Calculation:
# a * 2 + b - c
# = 3 * 2 + 5 - 4
# = 6 + 5 - 4
# = 7

Answer: 7

Key concept: Keyword arguments (c=4) allow you to skip over parameters with defaults. Here b uses its default value of 5.

Q1.4 (5 points)

data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
total = 0
for key in data:
    total += sum(data[key])
print(total)

💡 Click to View Answer & Explanation

Step-by-step breakdown:

# Dictionary with lists as values
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
 
total = 0
 
# Iterating over a dictionary gives KEYS, not values
for key in data:  # key will be 'A', then 'B'
    
    # First iteration: key = 'A'
    # data['A'] = [1, 2, 3]
    # sum([1, 2, 3]) = 6
    # total = 0 + 6 = 6
    
    # Second iteration: key = 'B'
    # data['B'] = [4, 5, 6]
    # sum([4, 5, 6]) = 15
    # total = 6 + 15 = 21
    
    total += sum(data[key])
 
print(total)  # 21

Answer: 21

Calculation summary:

sum([1, 2, 3]) = 6
sum([4, 5, 6]) = 15
Total = 6 + 15 = 21

Question 2: Code Writing (30 points)

Choose 3 out of 5 questions. Each worth 10 points.

Q2.1 - Grade Calculator (10 points)

Write a function grade_calculator(score) that:

Returns letter grade: 90+ → "A", 80+ → "B", 70+ → "C", 60+ → "D", <60 → "F"
Returns "Invalid" for scores < 0 or > 100

💡 Click to View Verified Answer

def grade_calculator(score):
    """
    Convert numeric score to letter grade.
    
    Args:
        score: Numeric score (expected range: 0-100)
        
    Returns:
        str: Letter grade (A/B/C/D/F) or "Invalid" for out-of-range scores
    
    Examples:
        >>> grade_calculator(95)
        'A'
        >>> grade_calculator(-5)
        'Invalid'
    """
    # STEP 1: Validate input range FIRST
    # Must check invalid cases before checking grade ranges
    if score < 0 or score > 100:
        return "Invalid"
    
    # STEP 2: Check grades from highest to lowest
    # Using elif ensures only one condition matches
    if score >= 90:
        return "A"  # 90-100
    elif score >= 80:
        return "B"  # 80-89
    elif score >= 70:
        return "C"  # 70-79
    elif score >= 60:
        return "D"  # 60-69
    else:
        return "F"  # 0-59
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (95, "A"),
        (85, "B"),
        (73, "C"),
        (65, "D"),
        (45, "F"),
        (-5, "Invalid"),
        (105, "Invalid"),
        (100, "A"),  # Edge case: exactly 100
        (0, "F"),    # Edge case: exactly 0
    ]
    
    print("Testing grade_calculator:")
    for score, expected in test_cases:
        result = grade_calculator(score)
        status = "✓" if result == expected else "✗"
        print(f"  {status} grade_calculator({score}) = {result} (expected {expected})")

Test Output:

Testing grade_calculator:
  ✓ grade_calculator(95) = A (expected A)
  ✓ grade_calculator(85) = B (expected B)
  ✓ grade_calculator(73) = C (expected C)
  ✓ grade_calculator(65) = D (expected D)
  ✓ grade_calculator(45) = F (expected F)
  ✓ grade_calculator(-5) = Invalid (expected Invalid)
  ✓ grade_calculator(105) = Invalid (expected Invalid)
  ✓ grade_calculator(100) = A (expected A)
  ✓ grade_calculator(0) = F (expected F)

Common mistakes to avoid:

Not validating input range first
Using multiple if statements instead of elif
Checking in wrong order (e.g., 60+ before 90+)

Q2.2 - Remove Duplicates (10 points)

Write a function remove_duplicates(lst) that:

Removes duplicates from a list
Preserves the order of first occurrence
Example: [1, 2, 2, 3, 1, 4] → [1, 2, 3, 4]

💡 Click to View Verified Answer

def remove_duplicates(lst):
    """
    Remove duplicate elements while preserving order of first occurrence.
    
    Args:
        lst: Input list with possible duplicates
        
    Returns:
        list: New list with duplicates removed, order preserved
        
    Examples:
        >>> remove_duplicates([1, 2, 2, 3, 1, 4])
        [1, 2, 3, 4]
    """
    # Track elements we've already seen
    seen = []
    
    # Iterate through original list
    for item in lst:
        # Only add to result if not seen before
        if item not in seen:
            seen.append(item)
    
    return seen
 
 
# Alternative approach using dictionary (Python 3.7+ preserves order)
def remove_duplicates_v2(lst):
    """
    Remove duplicates using dict.fromkeys() - more efficient for large lists.
    Works because dictionaries preserve insertion order in Python 3.7+.
    """
    return list(dict.fromkeys(lst))
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        [1, 2, 2, 3, 1, 4],        # Basic case
        [5, 5, 5, 5],              # All duplicates
        [1, 2, 3, 4],              # No duplicates
        [],                        # Empty list
        ['a', 'b', 'a', 'c'],      # Strings
    ]
    
    print("Testing remove_duplicates:")
    for test in test_cases:
        result = remove_duplicates(test)
        print(f"  {test} → {result}")

Test Output:

Testing remove_duplicates:
  [1, 2, 2, 3, 1, 4] → [1, 2, 3, 4]
  [5, 5, 5, 5] → [5]
  [1, 2, 3, 4] → [1, 2, 3, 4]
  [] → []
  ['a', 'b', 'a', 'c'] → ['a', 'b', 'c']

Why not use set()? Sets don't preserve order! list(set([1, 2, 2, 3, 1, 4])) might give [1, 2, 3, 4] but order is NOT guaranteed.

Q2.3 - Fibonacci Sequence (10 points)

Write a function fibonacci(n) that:

Returns the first n Fibonacci numbers as a list
Sequence: 0, 1, 1, 2, 3, 5, 8, 13...

💡 Click to View Verified Answer

def fibonacci(n):
    """
    Generate the first n Fibonacci numbers.
    
    Fibonacci sequence: Each number is the sum of the two preceding ones.
    Starts with 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
    
    Args:
        n: Number of Fibonacci numbers to generate (non-negative integer)
        
    Returns:
        list: First n Fibonacci numbers
        
    Examples:
        >>> fibonacci(5)
        [0, 1, 1, 2, 3]
        >>> fibonacci(0)
        []
    """
    # Handle edge cases
    if n <= 0:
        return []  # No numbers requested
    if n == 1:
        return [0]  # Only first number
    
    # Initialize with first two Fibonacci numbers
    result = [0, 1]
    
    # Generate remaining numbers
    for i in range(2, n):
        # Each new number = sum of last two numbers
        # Using negative indexing: result[-1] is last, result[-2] is second-to-last
        next_num = result[-1] + result[-2]
        result.append(next_num)
    
    return result
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [0, 1, 2, 5, 8, 10]
    
    print("Testing fibonacci:")
    for n in test_cases:
        result = fibonacci(n)
        print(f"  fibonacci({n}) = {result}")

Test Output:

Testing fibonacci:
  fibonacci(0) = []
  fibonacci(1) = [0]
  fibonacci(2) = [0, 1]
  fibonacci(5) = [0, 1, 1, 2, 3]
  fibonacci(8) = [0, 1, 1, 2, 3, 5, 8, 13]
  fibonacci(10) = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

How it works:

Position:  0   1   2   3   4   5   6   7
Value:     0   1   1   2   3   5   8   13
                 ↑   ↑
           0+1=1  1+1=2  1+2=3  2+3=5  3+5=8

Q2.4 - Prime Number Check (10 points)

Write a function is_prime(num) that:

Returns True if num is prime, False otherwise
Handle edge cases (num < 2)

💡 Click to View Verified Answer

def is_prime(num):
    """
    Check if a number is prime.
    
    A prime number is a natural number greater than 1 that has no positive 
    divisors other than 1 and itself.
    
    Args:
        num: Integer to check
        
    Returns:
        bool: True if prime, False otherwise
        
    Examples:
        >>> is_prime(7)
        True
        >>> is_prime(12)
        False
    """
    # Numbers less than 2 are not prime by definition
    # This handles 0, 1, and negative numbers
    if num < 2:
        return False
    
    # 2 is the only even prime number
    if num == 2:
        return True
    
    # All other even numbers are not prime
    # (They're divisible by 2)
    if num % 2 == 0:
        return False
    
    # Check odd divisors from 3 up to √num
    # Why √num? If num = a × b, at least one of a,b must be ≤ √num
    # If no divisor found up to √num, num is prime
    for i in range(3, int(num ** 0.5) + 1, 2):  # Step by 2 (odd numbers only)
        if num % i == 0:
            return False  # Found a divisor, not prime
    
    return True  # No divisors found, it's prime
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (1, False),   # Not prime (less than 2)
        (2, True),    # Prime (smallest prime)
        (3, True),    # Prime
        (4, False),   # Not prime (2 × 2)
        (7, True),    # Prime
        (9, False),   # Not prime (3 × 3)
        (11, True),   # Prime
        (25, False),  # Not prime (5 × 5)
        (29, True),   # Prime
        (-5, False),  # Negative, not prime
    ]
    
    print("Testing is_prime:")
    for num, expected in test_cases:
        result = is_prime(num)
        status = "✓" if result == expected else "✗"
        print(f"  {status} is_prime({num}) = {result}")

Test Output:

Testing is_prime:
  ✓ is_prime(1) = False
  ✓ is_prime(2) = True
  ✓ is_prime(3) = True
  ✓ is_prime(4) = False
  ✓ is_prime(7) = True
  ✓ is_prime(9) = False
  ✓ is_prime(11) = True
  ✓ is_prime(25) = False
  ✓ is_prime(29) = True
  ✓ is_prime(-5) = False

Optimization: Checking up to √n instead of n reduces time complexity from O(n) to O(√n).
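
To make the saving concrete, the sketch below counts how many trial divisions each approach performs for a prime input (the worst case, since no divisor triggers an early exit). `trial_divisions` is a hypothetical helper for illustration; unlike the answer above, it tries every integer rather than only odd ones, so the counts are slightly higher than the real `is_prime` would need.

```python
import math


def trial_divisions(num, limit):
    """Try divisors 2..limit and return (is_prime, divisions_tried)."""
    if num < 2:
        return False, 0
    tried = 0
    for divisor in range(2, limit + 1):
        tried += 1
        if num % divisor == 0:
            return False, tried  # Found a divisor, stop early
    return True, tried


num = 97  # a prime, so both scans run to completion
naive = trial_divisions(num, num - 1)            # scan 2..n-1
bounded = trial_divisions(num, math.isqrt(num))  # scan 2..floor(sqrt(n))

print(naive)    # (True, 95)
print(bounded)  # (True, 8)
```

`math.isqrt` returns the exact integer square root, which avoids floating-point rounding for very large inputs.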

Q2.5 - Tuple Statistics (10 points)

Write a function tuple_stats(data) that:

Input: tuple of numbers
Return: tuple of (min, max, average rounded to 2 decimals)

💡 Click to View Verified Answer

def tuple_stats(data):
    """
    Calculate statistics for a tuple of numbers.
    
    Args:
        data: Tuple of numeric values
        
    Returns:
        tuple: (minimum, maximum, average) where average is rounded to 2 decimals
        
    Raises:
        ValueError: If tuple is empty
        
    Examples:
        >>> tuple_stats((10, 20, 30, 40))
        (10, 40, 25.0)
    """
    # Handle empty tuple edge case
    if len(data) == 0:
        raise ValueError("Cannot compute stats for empty tuple")
    
    # Calculate statistics using built-in functions
    minimum = min(data)        # Smallest value
    maximum = max(data)        # Largest value
    average = sum(data) / len(data)  # Arithmetic mean
    
    # Round average to 2 decimal places
    average = round(average, 2)
    
    # Return as tuple (note: using parentheses to make it clear)
    return (minimum, maximum, average)
 
 
# ===== Test Cases =====
if __name__ == "__main__":
    test_cases = [
        (10, 20, 30, 40),      # Even spread
        (5, 15, 25),           # Odd count
        (7,),                  # Single element
        (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),  # Larger tuple
    ]
    
    print("Testing tuple_stats:")
    for data in test_cases:
        result = tuple_stats(data)
        print(f"  tuple_stats({data})")
        print(f"    → (min={result[0]}, max={result[1]}, avg={result[2]})")

Test Output:

Testing tuple_stats:
  tuple_stats((10, 20, 30, 40))
    → (min=10, max=40, avg=25.0)
  tuple_stats((5, 15, 25))
    → (min=5, max=25, avg=15.0)
  tuple_stats((7,))
    → (min=7, max=7, avg=7.0)
  tuple_stats((1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
    → (min=1, max=10, avg=5.5)
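
The test cases above all have averages that come out exact, and none exercises the empty-tuple branch. The sketch below covers both; the function is repeated in compact form only so that the snippet runs on its own, and the inputs are made up for illustration.

```python
def tuple_stats(data):
    """Compact restatement of the answer above: (min, max, rounded mean)."""
    if len(data) == 0:
        raise ValueError("Cannot compute stats for empty tuple")
    return (min(data), max(data), round(sum(data) / len(data), 2))


# 5 / 3 = 1.666..., so rounding to 2 decimals actually changes the value
print(tuple_stats((1, 2, 2)))  # (1, 2, 1.67)

# An empty tuple raises instead of dividing by zero
try:
    tuple_stats(())
except ValueError as exc:
    print(f"ValueError: {exc}")
```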

Question 3: Pandas & ML Basics (25 points)

Part A: Theory (10 points)

Q3.A1 (5 points) Explain the difference between fit() and predict() in scikit-learn.

💡 Click to View Answer

Method	Purpose	When Called	What It Does
`fit()`	Train the model	Once, on training data	Learns patterns/parameters from data
`predict()`	Use the model	On test/new data	Applies learned patterns to make predictions

Workflow example:

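# Assumes X_train, y_train and X_test have already been prepared
from sklearn.tree import DecisionTreeClassifier
 
# Step 1: Create model
model = DecisionTreeClassifier()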
 
# Step 2: Train model (learn from training data)
model.fit(X_train, y_train)  # Learns patterns
 
# Step 3: Use model (apply to new data)
predictions = model.predict(X_test)  # Makes predictions

Analogy:

fit() = studying for an exam
predict() = taking the exam
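
The same contract can be seen in a toy model written from scratch. This is only a sketch: `MajorityClassifier` is a hypothetical class, not part of scikit-learn, and it simply mimics the convention that `fit()` stores what was learned and `predict()` reuses it.

```python
from collections import Counter


class MajorityClassifier:
    """Toy model: always predicts the most common label seen during fit()."""

    def fit(self, X, y):
        # "Studying": learn one parameter from the training labels
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        # "Taking the exam": apply the learned parameter to every new row
        return [self.majority_ for _ in X]


model = MajorityClassifier()
model.fit([[1], [2], [3]], ["spam", "spam", "ham"])
print(model.predict([[9], [10]]))  # ['spam', 'spam']
```

Calling `predict()` before `fit()` fails here with an `AttributeError`, because nothing has been learned yet.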

Q3.A2 (5 points) Why do we need train-test split? Why not use all data for training?

💡 Click to View Answer

Why train-test split is essential:

Evaluate on unseen data: We need to test how the model performs on data it hasn't seen during training.
Detect overfitting: If we train and test on the same data, the model might just memorize the answers (overfitting). Train-test split reveals if the model generalizes well.
Simulate real-world usage: In production, the model will encounter new, unseen data. Testing on held-out data simulates this.
Get honest performance estimate: Training accuracy is often misleadingly high; test accuracy gives a realistic measure.

What happens without split:

Model could achieve 100% accuracy on training data
But fail completely on new data
No way to detect this problem until deployment
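
A deliberately extreme illustration of this failure mode is a model that only memorizes its training examples. The sketch below uses a hypothetical `Memorizer` class and made-up data; it is not a real learning algorithm.

```python
class Memorizer:
    """Toy model: stores training pairs and looks them up verbatim."""

    def fit(self, X, y):
        self.table_ = dict(zip(X, y))
        return self

    def predict(self, X):
        # Anything not seen during training cannot be answered
        return [self.table_.get(x, "unknown") for x in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


X_train, y_train = [1, 2, 3, 4], ["odd", "even", "odd", "even"]
X_new, y_new = [5, 6], ["odd", "even"]

model = Memorizer().fit(X_train, y_train)
train_acc = accuracy(y_train, model.predict(X_train))
new_acc = accuracy(y_new, model.predict(X_new))

print(train_acc)  # 1.0
print(new_acc)    # 0.0
```

Evaluated only on the training data, this model looks perfect; the held-out data exposes that it learned nothing general.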

Typical split ratios:

80/20 (training/test)
70/30 (training/test)
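
In practice scikit-learn's `train_test_split` does this job. To show what an 80/20 split means mechanically, here is a minimal pure-Python sketch; `holdout_split` and its parameters are hypothetical names chosen for illustration.

```python
import random


def holdout_split(rows, test_ratio=0.2, seed=42):
    """Shuffle rows reproducibly and return (train_rows, test_rows)."""
    rows = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(rows)   # fixed seed gives a repeatable split
    n_test = int(len(rows) * test_ratio)
    return rows[n_test:], rows[:n_test]


train, test = holdout_split(range(10))
print(len(train), len(test))  # 8 2
```

Every row lands in exactly one of the two parts, so the test rows are never seen during training.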

Part B: Pandas Code (15 points)

Given this DataFrame:

import pandas as pd
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, None, 28],
    'Salary': [50000, 60000, 75000, 65000, None],
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR']
}
df = pd.DataFrame(data)

Q3.B1 (5 points) Fill missing Age with mean, missing Salary with 55000.

💡 Click to View Verified Answer

import pandas as pd
 
# Create the DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, None, 28],
    'Salary': [50000, 60000, 75000, 65000, None],
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR']
}
df = pd.DataFrame(data)
 
print("Before filling:")
print(df)
print()
 
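# Fill missing Age with the column mean
# Assign the result back to the column. Calling fillna(..., inplace=True) on
# df['Age'] is chained assignment: it warns in recent pandas versions and
# does not modify df when Copy-on-Write is enabled.
age_mean = df['Age'].mean()  # Calculate mean first (ignores NaN)
print(f"Age mean (excluding NaN): {age_mean}")  # = 29.5
df['Age'] = df['Age'].fillna(age_mean)
 
# Fill missing Salary with 55000
df['Salary'] = df['Salary'].fillna(55000)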
 
print("\nAfter filling:")
print(df)

Output:

Before filling:
      Name   Age   Salary Department
0    Alice  25.0  50000.0         IT
1      Bob  30.0  60000.0         HR
2  Charlie  35.0  75000.0         IT
3    David   NaN  65000.0    Finance
4      Eve  28.0      NaN         HR

Age mean (excluding NaN): 29.5

After filling:
      Name   Age   Salary Department
0    Alice  25.0  50000.0         IT
1      Bob  30.0  60000.0         HR
2  Charlie  35.0  75000.0         IT
3    David  29.5  65000.0    Finance
4      Eve  28.0  55000.0         HR

Note: mean() automatically ignores NaN values when calculating.
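
To see where 29.5 comes from, the same calculation can be done by hand without pandas. This is only an illustrative sketch using a plain list in place of the Age column.

```python
import math

ages = [25, 30, 35, float("nan"), 28]  # NaN stands in for David's missing Age

# Keep only the values that are present, as mean() does by default
present = [a for a in ages if not math.isnan(a)]
age_mean = sum(present) / len(present)  # (25 + 30 + 35 + 28) / 4

# Replace each missing value with that mean
filled = [age_mean if math.isnan(a) else a for a in ages]

print(age_mean)  # 29.5
print(filled)    # [25, 30, 35, 29.5, 28]
```

Note that the divisor is 4, not 5: the missing entry is excluded from both the sum and the count.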

Q3.B2 (5 points) Calculate average salary by Department.

💡 Click to View Verified Answer

# Group by Department and calculate mean of Salary
avg_salary_by_dept = df.groupby('Department')['Salary'].mean()
 
print("Average Salary by Department:")
print(avg_salary_by_dept)

Output (after filling missing values):

Average Salary by Department:
Department
Finance    65000.0
HR         57500.0
IT         62500.0
Name: Salary, dtype: float64

Calculation breakdown:

Finance: 65000 (only David)
HR: (60000 + 55000) / 2 = 57500 (Bob + Eve)
IT: (50000 + 75000) / 2 = 62500 (Alice + Charlie)
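
The breakdown above is what `groupby(...).mean()` does internally: split the rows by key, then average each group. A minimal pure-Python sketch of the same steps, using the filled salaries from Q3.B1:

```python
from collections import defaultdict

rows = [
    ("Alice", "IT", 50000),
    ("Bob", "HR", 60000),
    ("Charlie", "IT", 75000),
    ("David", "Finance", 65000),
    ("Eve", "HR", 55000),  # filled value from Q3.B1
]

# Split: collect the salaries belonging to each department
groups = defaultdict(list)
for _name, dept, salary in rows:
    groups[dept].append(salary)

# Apply and combine: average each group, sorted by department name
averages = {dept: sum(vals) / len(vals) for dept, vals in sorted(groups.items())}
print(averages)  # {'Finance': 65000.0, 'HR': 57500.0, 'IT': 62500.0}
```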

Q3.B3 (5 points) Select IT employees with Age > 26.

💡 Click to View Verified Answer

# Filter with multiple conditions
# IMPORTANT: Use & for AND, | for OR
# IMPORTANT: Wrap each condition in parentheses
result = df[(df['Department'] == 'IT') & (df['Age'] > 26)]
 
print("IT employees with Age > 26:")
print(result)

Output:

IT employees with Age > 26:
      Name   Age   Salary Department
2  Charlie  35.0  75000.0         IT

Syntax rules for pandas filtering:

Use & instead of and
Use | instead of or
Use ~ instead of not
Wrap each condition in parentheses

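Wrong: df[df['A'] == 1 and df['B'] == 2] (raises ValueError: `and` needs a single True/False, and the truth value of a whole Series is ambiguous)
Correct: df[(df['A'] == 1) & (df['B'] == 2)]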
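
The parentheses rule follows from operator precedence: `&` binds more tightly than `==`. The effect can be seen with plain integers, without pandas; the values below are made up for illustration.

```python
a, b = 1, 2

# Without parentheses, 1 & b is evaluated first, so this becomes the
# chained comparison a == (1 & b) == 2, i.e. 1 == 0 == 2
unparenthesized = a == 1 & b == 2

# With parentheses, each comparison is evaluated first, then combined
parenthesized = (a == 1) & (b == 2)

print(unparenthesized)  # False
print(parenthesized)    # True
```

With DataFrame columns the unparenthesized form goes wrong in the same way, which is why each condition needs its own parentheses.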

Question 4: Decision Tree & Naive Bayes (25 points)

Part A: Theory (10 points)

Q4.A1 (5 points) List THREE advantages of Decision Trees.

💡 Click to View Answer

Any three of the following five:

1. Easy to interpret and visualize
- Can draw the tree and follow decision paths
- Non-technical stakeholders can understand the logic
- "If-then" rules are intuitive
2. No feature scaling required
- Works directly with raw data values
- Unlike SVM or KNN, doesn't need normalization
- Saves preprocessing time
3. Handles both numerical and categorical data
- Can split on continuous values (Age > 30)
- Can split on categories (Color == 'Red'); note that scikit-learn's trees need categories encoded as numbers first
- Versatile for mixed datasets
4. Captures non-linear relationships
- Can model complex decision boundaries
- Doesn't assume linear separability
5. Shows feature importance
- Reveals which features matter most
- Helps with feature selection

Q4.A2 (5 points) Gini Index vs Information Gain. Which does CART use?

💡 Click to View Answer

Metric	Formula	Range (binary)	Used By
Gini Index	1 - Σ(pᵢ²)	0 to 0.5	CART
Entropy/Information Gain	-Σ(pᵢ log₂ pᵢ)	0 to 1	ID3, C4.5

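CART (Classification and Regression Trees) uses the Gini Index for classification splits; its regression trees split on variance (squared-error) reduction instead.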

Why Gini?

Faster to compute (no logarithm)
Similar results to entropy in practice
Slightly favors larger partitions

Interpretation:

Gini = 0 → Pure node (all same class)
Gini = 0.5 → Maximum impurity (50/50 split)
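
Both formulas in the table are short enough to check in code. The sketch below takes class counts for a single node; `gini` and `entropy` are illustrative helper names, not library functions.

```python
import math


def gini(counts):
    """Gini impurity 1 - sum(p_i^2) for a node with the given class counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


def entropy(counts):
    """Entropy -sum(p_i * log2(p_i)); empty classes contribute nothing."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)


print(gini([10, 0]), entropy([10, 0]))  # 0.0 0.0  (pure node)
print(gini([5, 5]), entropy([5, 5]))    # 0.5 1.0  (50/50 split)
```

The two printed rows reproduce the binary ranges in the table: 0 to 0.5 for Gini and 0 to 1 for entropy.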

Part B: Gini Calculation (15 points)

Scenario: Email classification with 20 emails (12 Spam, 8 Not Spam)

Split 1 - "Contains free":

Contains "free": 10 emails (9 Spam, 1 Not Spam)
No "free": 10 emails (3 Spam, 7 Not Spam)

Split 2 - "Contains meeting":

Contains "meeting": 8 emails (2 Spam, 6 Not Spam)
No "meeting": 12 emails (10 Spam, 2 Not Spam)

Q4.B1 (8 points) Calculate Gini Index for Split 1.

💡 Click to View Verified Answer

Formula: Gini = 1 - Σ(pᵢ²)

Step 1: Gini for "Contains free" node (10 emails: 9 Spam, 1 Not Spam)

P(Spam) = 9/10 = 0.9
P(Not Spam) = 1/10 = 0.1

Gini = 1 - (0.9² + 0.1²)
     = 1 - (0.81 + 0.01)
     = 1 - 0.82
     = 0.18

Step 2: Gini for "No free" node (10 emails: 3 Spam, 7 Not Spam)

P(Spam) = 3/10 = 0.3
P(Not Spam) = 7/10 = 0.7

Gini = 1 - (0.3² + 0.7²)
     = 1 - (0.09 + 0.49)
     = 1 - 0.58
     = 0.42

Step 3: Weighted Average Gini

Gini(Split 1) = (10/20) × 0.18 + (10/20) × 0.42
              = 0.5 × 0.18 + 0.5 × 0.42
              = 0.09 + 0.21
              = 0.30

Answer: Split 1 Gini = 0.30

Q4.B2 (7 points) Calculate Gini Index for Split 2. Which split is better?

💡 Click to View Verified Answer

Step 1: Gini for "Contains meeting" node (8 emails: 2 Spam, 6 Not Spam)

P(Spam) = 2/8 = 0.25
P(Not Spam) = 6/8 = 0.75

Gini = 1 - (0.25² + 0.75²)
     = 1 - (0.0625 + 0.5625)
     = 1 - 0.625
     = 0.375

Step 2: Gini for "No meeting" node (12 emails: 10 Spam, 2 Not Spam)

P(Spam) = 10/12 = 0.833
P(Not Spam) = 2/12 = 0.167

Gini = 1 - (0.833² + 0.167²)
     = 1 - (0.694 + 0.028)
     = 1 - 0.722
     = 0.278

Step 3: Weighted Average Gini

Gini(Split 2) = (8/20) × 0.375 + (12/20) × 0.278
              = 0.4 × 0.375 + 0.6 × 0.278
              = 0.15 + 0.167
              = 0.317

Answer: Split 2 Gini = 0.317

Comparison:

Split	Gini Index
Split 1 ("free")	0.30 ← Better
Split 2 ("meeting")	0.317

Better split: Split 1 ("Contains free")

Reason: Lower Gini = Lower impurity = Better separation of classes
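
The hand calculations for both splits can be checked with a few lines of code. This is a minimal sketch; `gini` and `weighted_gini` are illustrative helper names, and each child node is given as (Spam, Not Spam) counts from the scenario.

```python
def gini(counts):
    """Gini impurity 1 - sum(p_i^2) for one node."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


def weighted_gini(children):
    """Average the child impurities, weighted by each child's share of rows."""
    total = sum(sum(child) for child in children)
    return sum(sum(child) / total * gini(child) for child in children)


split_free = [(9, 1), (3, 7)]      # contains "free" / no "free"
split_meeting = [(2, 6), (10, 2)]  # contains "meeting" / no "meeting"

g1 = weighted_gini(split_free)
g2 = weighted_gini(split_meeting)

print(round(g1, 3))  # 0.3
print(round(g2, 3))  # 0.317
print("Better split:", "free" if g1 < g2 else "meeting")
```

Without intermediate rounding, Split 2 comes out at 19/60 ≈ 0.3167, which rounds to the 0.317 obtained by hand.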

🏁 End of Exam

Question	Topic	Points
Q1	Python Output Analysis	20
Q2	Code Writing (choose 3/5)	30
Q3	Pandas & ML Theory	25
Q4	Decision Tree & Gini	25
Total		100

📝 Key Formulas Reference

Concept	Formula
Gini Index	1 - Σ(pᵢ²)
Entropy	-Σ pᵢ log₂(pᵢ)
Info Gain	H(parent) - Σ weighted H(children)
Bayes	P(A|B) ∝ P(B|A) × P(A)
Z-score	(x - μ) / σ
MinMax	(x - min) / (max - min)
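
As a quick check of the two scaling formulas, here is a minimal sketch on made-up data. It assumes σ is the population standard deviation (`statistics.pstdev`); with the sample standard deviation the z-scores would be slightly smaller in magnitude.

```python
import statistics


def z_scores(values):
    """Standardize: (x - mean) / population standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # assumption: population sigma
    return [(x - mu) / sigma for x in values]


def min_max(values):
    """Rescale to the range [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


data = [10, 20, 30]
print(min_max(data))                          # [0.0, 0.5, 1.0]
print([round(z, 3) for z in z_scores(data)])  # [-1.225, 0.0, 1.225]
```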

All code verified and tested. Show your work for partial credit. Good luck!
