ABW505 Complete Question Bank - Python & Machine Learning

📚 All code in this document has been verified and tested. Every answer includes detailed explanations and comments.

📋 Exam Structure Overview

Section	Points	Type	Coverage
Q1	20	Output Analysis	Python basics: variables, operators, lists, tuples, functions, loops, conditions
Q2	30	Code Writing (3/5)	Decision structure, repetition, boolean logic, lists/tuples, functions
Q3	25	Theory + Code	Pandas, Data Preprocessing, Encoder, SVM, Random Forest
Q4	25	Theory + Calculation	Naive Bayes, Decision Tree, Gini Index, Entropy

📝 Q1: Python Output Analysis (20 points)

Key Pattern 1: Operator Precedence (MUST KNOW!)

Priority order: ** (power) → *, /, //, % → +, -

Operator	Meaning	Example
`**`	Power/Exponent	`5**2 = 25`
`/`	Division (float)	`5/2 = 2.5`
`//`	Floor division (integer)	`5//2 = 2`
`%`	Modulo (remainder)	`10%3 = 1`

Problem 1.1: Power and Floor Division

print(5**2 // 3)

💡 Click to View Answer

Step-by-step:

5**2 = 25 (power first, highest priority)
25 // 3 = 8 (floor division, discard remainder)

Answer: 8

Key concept: Power ** has higher priority than //. Floor division always rounds DOWN toward negative infinity.

Problem 1.2: Mixed Operations

print(3 + 4 * 4 // 4)

💡 Click to View Answer

Step-by-step:

4 * 4 = 16 (multiplication first)
16 // 4 = 4 (floor division, same priority as multiplication, left to right)
3 + 4 = 7 (addition last)

Answer: 7

Problem 1.3: Power and Multiplication

print(2 * 3 ** 2)

💡 Click to View Answer

Step-by-step:

3**2 = 9 (power first!)
2 * 9 = 18

Answer: 18

Common mistake: 2 * 3 = 6, then 6**2 = 36. WRONG! Power has higher priority.

Problem 1.4: Negative Floor Division (TRICKY!)

print(-5 // 3)

💡 Click to View Answer

Key insight: Floor division rounds toward NEGATIVE infinity, not toward zero!

-5 ÷ 3 = -1.666...
Rounding DOWN (toward -∞) → -2

Answer: -2

This is NOT the same as integer division in some other languages! Python's // always floors toward negative infinity.
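
The rounding rule can be checked by comparing `//` with truncation through `int()`. This is a minimal sketch using only built-ins; the variable names are illustrative.

```python
# Floor division rounds toward negative infinity; int() truncates toward zero.
floor_result = -5 // 3        # -2
trunc_result = int(-5 / 3)    # -1

# The remainder follows the floor rule: a == (a // b) * b + (a % b)
quotient, remainder = divmod(-5, 3)   # (-2, 1)

print(floor_result, trunc_result, quotient, remainder)
```

For positive operands the two approaches agree, which is why the difference only shows up in questions with a negative number.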

Key Pattern 2: List Iteration and Sum

Problem 1.5: Calculate Average

numbers = [2, 4, 6, 8]
total = 0
for n in numbers:
    total += n
print(total / len(numbers))

💡 Click to View Answer

Trace:

Loop 1: total = 0 + 2 = 2
Loop 2: total = 2 + 4 = 6
Loop 3: total = 6 + 6 = 12
Loop 4: total = 12 + 8 = 20
Average: 20 / 4 = 5.0

Answer: 5.0

Note: Division / always returns a float in Python 3, so the answer is 5.0 not 5.

Key Pattern 3: Tuple Operations (MUST KNOW!)

Keywords: Tuple = immutable list; cannot be modified after creation; defined with ()

Problem 1.5a: Basic Tuple Index

t = ("study", "exercises", "exam")
print(t[1])

💡 Click to View Answer

Index map:

Element:  "study"  "exercises"  "exam"
Index:       0         1          2

Answer: exercises

Keywords: Tuple indexing starts at 0, the same as a list

Problem 1.5b: Tuple with len()

t = ("A", "B", "C")
print(len(t))

💡 Click to View Answer

Answer: 3

Keywords: len() counts the elements; it works the same way for tuples and lists

Problem 1.5c: Tuple Negative Indexing

t = ("A", "B", "C")
print(t[-1])

💡 Click to View Answer

Negative index map:

Element:  "A"   "B"   "C"
Negative:  -3    -2    -1

Answer: C

Keywords: -1 is the last element; negative indices count from right to left

Problem 1.5d: Tuple Slicing

t = (1, 2, 3, 4, 5)
print(t[1:4])

💡 Click to View Answer

Slice rule: left-inclusive, right-exclusive

Answer: (2, 3, 4)

Keywords: t[1:4] takes indices 1, 2, 3 (4 is excluded); the result is still a tuple

Problem 1.5e: Tuple Repetition

t = (1, 2)
print(t * 3)

💡 Click to View Answer

Answer: (1, 2, 1, 2, 1, 2)

Keywords: * repeats the sequence, just as it does for strings

Problem 1.5f: Tuple is Immutable (TRICKY!)

t = (1, 2, 3)
t[0] = 100
print(t)

💡 Click to View Answer

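Answer: TypeError: 'tuple' object does not support item assignment (the program stops at t[0] = 100, so print(t) never runs)

Keywords: A tuple is immutable; its elements cannot be changed after creation

Compare: A list is mutable, so its elements can be changed

lst = [1, 2, 3]
lst[0] = 100  # ✅ works fine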
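
To see the error without crashing the program, the assignment can be wrapped in try/except. This is a small sketch; `t_new` and `message` are illustrative names, and building a new tuple is one common workaround rather than the only one.

```python
t = (1, 2, 3)

try:
    t[0] = 100          # tuples do not support item assignment
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

# To "change" a tuple, build a new one instead of modifying it in place.
t_new = (100,) + t[1:]
print(t_new)
```

The original tuple `t` is left untouched; only the new tuple carries the changed first element.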

Problem 1.5g: For Loop with Tuple

t = (2, 4, 6)
for x in t:
    print(x)

💡 Click to View Answer

Answer:

2
4
6

Keywords: Tuples support for-loop iteration, exactly like lists

Problem 1.5h: List of Tuples (nested-structure question!)

data = [("Ann", 80), ("Bob", 60)]
print(data[1])

💡 Click to View Answer

Key: The outer container is a list; each element is a tuple

Answer: ('Bob', 60)

Keywords: data[1] takes the list element at index 1 (the whole second tuple)

Problem 1.5i: Nested Indexing (double indexing)

data = [("Ann", 80), ("Bob", 60)]
print(data[1][0])

💡 Click to View Answer

Step-by-step:

data[1] = ("Bob", 60)
("Bob", 60)[0] = "Bob"

Answer: Bob

Keywords: Double indexing = nesting; take the outer element first, then the inner one

Problem 1.5j: Tuple Unpacking with For Loop

data = [("Ann", 80), ("Bob", 60)]
for name, score in data:
    print(name)

💡 Click to View Answer

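Key: Each tuple is unpacked automatically; name and score receive its two elements

Answer:

Ann
Bob

Keywords: Tuple unpacking; the number of variables must match the number of elements in the tuple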
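
The matching-count rule can be illustrated by deliberately unpacking into too many variables. A minimal sketch, with `names` and `unpack_error` as illustrative names:

```python
data = [("Ann", 80), ("Bob", 60)]

# Matching count: two variables for two-element tuples.
names = [name for name, score in data]
print(names)

# Mismatched count: three variables for a two-element tuple raises ValueError.
try:
    a, b, c = data[0]
except ValueError as err:
    unpack_error = str(err)
    print("ValueError:", unpack_error)
```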

Problem 1.5k: Mixed Tuple and List

data = [(1, 2), (3, 4), (5, 6)]
print(data[2][1])

💡 Click to View Answer

Step-by-step:

data[2] = (5, 6) (the third tuple)
(5, 6)[1] = 6 (the tuple's second element)

Answer: 6

Problem 1.5l: in Operator with Tuple

t = ("X", "Y", "Z")
if "Y" in t:
    print("Y")
else:
    print("N")

💡 Click to View Answer

Answer: Y

Keywords: in checks whether an element exists; both tuples and lists support it

Key Pattern 4: Function Basics (MUST KNOW!)

Keywords: def defines a function; return hands back a result and ends the function; print handles output

Problem 1.6a: Basic Function

def f(x):
    return x * 2
 
print(f(3))

💡 Click to View Answer

Step-by-step:

Call f(3), so x = 3
return 3*2 = 6
print(6)

Answer: 6

Keywords: Arguments are passed to parameters; return hands back the computed result

Problem 1.6b: Function Without Print (TRICKY!)

def f(x):
    return x * 2
 
f(3)

💡 Click to View Answer

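Answer: (no output!) Nothing is printed when this runs as a script; the returned 6 is simply discarded. (Only the interactive interpreter would echo 6.)

Keywords: return only hands back a value; it does not display anything! Without print, nothing is shown!

Key difference:

return = hands back the result and ends the function (no display)
print = writes to the screen (display)
Calling a function ≠ automatic output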
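
The difference can be made visible by capturing what the program actually writes to the screen. This is a sketch using the standard library's `redirect_stdout`; the variable names are illustrative.

```python
import io
from contextlib import redirect_stdout

def f(x):
    return x * 2

# Call without print: the value is returned and then discarded.
buffer = io.StringIO()
with redirect_stdout(buffer):
    f(3)
captured_without_print = buffer.getvalue()

# Call with print: the returned value is written to the screen.
buffer = io.StringIO()
with redirect_stdout(buffer):
    print(f(3))
captured_with_print = buffer.getvalue()

print(repr(captured_without_print), repr(captured_with_print))
```

The first capture is an empty string, which is what "no output" means in an exam answer.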

Problem 1.6c: Multiple Parameters

def add(a, b):
    return a + b
 
print(add(2, 5))

💡 Click to View Answer

Answer: 7

Keywords: Multiple parameters are separated by commas; 2 + 5 = 7

Problem 1.6d: Function with Arithmetic

def f(x):
    return x + 1
 
print(f(2) + f(3) * 2)

💡 Click to View Answer

Step-by-step:

f(2) = 2+1 = 3
f(3) = 3+1 = 4
3 + 4*2 = 3 + 8 = 11 (multiplication first!)

Answer: 11

Keywords: Function return values take part in expressions and follow normal operator precedence

Problem 1.6e: Boolean Function

def is_even(n):
    return n % 2 == 0
 
print(is_even(5))

💡 Click to View Answer

Step-by-step:

5 % 2 = 1 (remainder)
1 == 0? False

Answer: False

Keywords: A Boolean function returns True/False; % gives the remainder

Problem 1.6f: Function with If (common mixed question type)

def check(n):
    if n > 10:
        print("Big")
    else:
        print("Small")
    return None
 
result = check(12)
print(result)

💡 Click to View Answer

Step-by-step:

check(12): 12 > 10 is True, so print("Big") runs
return None
print(result) → print(None)

Answer:

Big
None

Keywords: The print inside the function runs first; the returned None is stored in result and then printed by print(result)

Problem 1.6g: Function with For Loop

def sum_list(a):
    s = 0
    for x in a:
        s += x
    return s
 
print(sum_list([1, 2, 3]))

💡 Click to View Answer

Trace:

s=0, x=1: s=0+1=1
x=2: s=1+2=3
x=3: s=3+3=6

Answer: 6

Keywords: A function parameter can be a list; loop over it and accumulate

Problem 1.6h: Function Returning String

def grade(m):
    if m >= 50:
        return "pass"
    else:
        return "fail"
 
print(grade(45))

💡 Click to View Answer

Step-by-step:

grade(45): 45>=50? False
return "fail"

Answer: fail

Keywords: return can hand back any type, including strings

Problem 1.6i: Nested Function Call (possibly beyond the syllabus)

def f(x):
    return x + 1
 
def g(x):
    return f(x) * 2
 
print(g(3))

💡 Click to View Answer

Step-by-step:

g(3) calls f(3)
f(3) = 3+1 = 4
g(3) = 4 * 2 = 8

Answer: 8

Keywords: Nested function calls; the inner function runs first

Key Pattern 5: String Slicing (Left-Inclusive, Right-Exclusive)

Problem 1.6: String Slice

s = "ABW505"
print(s[1:5])

💡 Click to View Answer

Index map:

Character:  A   B   W   5   0   5
Index:      0   1   2   3   4   5

s[1:5] → indices 1, 2, 3, 4 (NOT including 5)

Answer: BW50

Problem 1.7: Negative Indexing

text = "Hello World"
print(text[-5:-1])

💡 Click to View Answer

Index map:

Character: H   e   l   l   o       W   o   r   l   d
Positive:  0   1   2   3   4   5   6   7   8   9   10
Negative:-11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1

text[-5:-1] → from 'W' (index -5) to 'l' (index -2, NOT including -1)

Answer: Worl

Key Pattern 6: List Operations

Problem 1.8: Slice Assignment (TRICKY!)

numbers = [10, 20, 30, 40, 50]
numbers[1:4] = [100]
print(len(numbers))
print(numbers[2])

💡 Click to View Answer

Step-by-step:

Original: [10, 20, 30, 40, 50]
numbers[1:4] selects [20, 30, 40] (3 elements)
Replace with [100] (1 element)
Result: [10, 100, 50]
Length: 3
numbers[2] = 50

Answers:

len(numbers) → 3
numbers[2] → 50

Key concept: Slice assignment can change list size! Replacing 3 elements with 1 element reduces length by 2.
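
Slice assignment can also make a list longer. The sketch below shows both directions on copies of the same list; `shrunk` and `grown` are illustrative names.

```python
numbers = [10, 20, 30, 40, 50]

shrunk = numbers.copy()
shrunk[1:4] = [100]             # 3 elements replaced by 1

grown = numbers.copy()
grown[1:2] = [21, 22, 23]       # 1 element replaced by 3

print(shrunk, len(shrunk))
print(grown, len(grown))
```

In each case the new length is the old length minus the size of the slice plus the size of the replacement.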

Problem 1.9: List Reference vs Copy

a = [1, 2, 3]
b = a
b.append(4)
print(a)
print(a is b)

💡 Click to View Answer

Key concept: b = a creates a REFERENCE, not a copy!

a and b point to the SAME list object
Modifying b also modifies a
a is b → True (same object in memory)

Answers:

print(a) → [1, 2, 3, 4]
print(a is b) → True

To create an independent copy: Use b = a.copy() or b = a[:]
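
A short sketch contrasting an alias with the two copy methods mentioned above. Note that both `copy()` and `[:]` make shallow copies, so nested lists inside would still be shared.

```python
a = [1, 2, 3]
alias = a            # same object, just a second name
copy1 = a.copy()     # new list with the same elements
copy2 = a[:]         # slice copy, also a new list

alias.append(4)      # changes the one shared list

print(a, alias)      # both show the appended 4
print(copy1, copy2)  # the copies are unaffected
print(alias is a, copy1 is a)
```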

Key Pattern 7: Functions with Default Arguments

Problem 1.10: Keyword Arguments

def mystery(a, b=5, c=10):
    return a * 2 + b - c
 
result = mystery(3, c=4)
print(result)

💡 Click to View Answer

Step-by-step:

a = 3 (positional argument)
b = 5 (uses default, NOT overridden)
c = 4 (keyword argument overrides default)
Calculation: 3 * 2 + 5 - 4 = 6 + 5 - 4 = 7

Answer: 7

Key concept: Keyword arguments let you skip over default parameters.
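
Calling the same function in several ways may help when tracing which defaults are kept. A minimal sketch; `r1` to `r4` are illustrative names.

```python
def mystery(a, b=5, c=10):
    return a * 2 + b - c

r1 = mystery(3)               # b and c use defaults: 6 + 5 - 10
r2 = mystery(3, 1)            # second positional fills b: 6 + 1 - 10
r3 = mystery(3, c=4)          # b keeps its default, c overridden: 6 + 5 - 4
r4 = mystery(c=4, a=3, b=0)   # keyword arguments may come in any order: 6 + 0 - 4

print(r1, r2, r3, r4)
```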

Key Pattern 8: Loops and Range

Problem 1.11: Range with Accumulator

total = 0
for i in range(1, 4):
    total += i
print(total)

💡 Click to View Answer

range(1, 4) generates: 1, 2, 3 (NOT including 4)

Accumulation: 0 + 1 + 2 + 3 = 6

Answer: 6

Problem 1.12: Break Statement

for i in range(5):
    if i == 2:
        break
    print(i)

💡 Click to View Answer

i=0: Print 0
i=1: Print 1
i=2: Break! Exit loop immediately

Answer:

0
1

Key Pattern 9: List Comprehension

Problem 1.13: Filtered List Comprehension

nums = [1, 2, 3, 4, 5]
result = [x**2 for x in nums if x % 2 == 1]
print(result)
print(sum(result))

💡 Click to View Answer

Step-by-step:

Filter odd numbers: 1, 3, 5 (where x % 2 == 1)
Square each: 1², 3², 5² = 1, 9, 25
Result: [1, 9, 25]
Sum: 1 + 9 + 25 = 35

Answers:

result → [1, 9, 25]
sum(result) → 35

Pattern: [expression for item in iterable if condition]
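
When tracing a comprehension by hand, it can help to expand it into the equivalent for loop. The sketch below builds the same list both ways; `result_loop` and `result_comp` are illustrative names.

```python
nums = [1, 2, 3, 4, 5]

# Expanded form: for loop + if + append
result_loop = []
for x in nums:
    if x % 2 == 1:                 # condition
        result_loop.append(x ** 2)  # expression

# Comprehension form: same three parts on one line
result_comp = [x ** 2 for x in nums if x % 2 == 1]

print(result_loop, result_comp)
```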

📝 Q2: Code Writing (30 points - Choose 3 of 5)

Template 1: Menu with List (MUST MEMORIZE!)

Problem: Write a Python program that displays this menu repeatedly:

Add a number to the list
Display the list
Exit

💡 Click to View Verified Answer

# Initialize empty list to store numbers
data = []
 
# Main program loop - runs until user chooses to exit
while True:
    # Display menu options with clear prompts
    print("\n--- MENU ---")
    print("1. Add a number to the list")
    print("2. Display the list")
    print("3. Exit")
    
    # Get user choice with prompt (IMPORTANT: include prompt text!)
    choice = input("Enter your choice (1/2/3): ")
    
    # Process user choice
    if choice == "1":
        # Option 1: Add number
        # Use try-except to handle invalid input gracefully
        try:
            num = int(input("Enter a number to add: "))
            data.append(num)
            print(f"Added {num} to the list.")
        except ValueError:
            print("Invalid input! Please enter a valid integer.")
    
    elif choice == "2":
        # Option 2: Display list
        if len(data) == 0:
            print("The list is empty.")
        else:
            print(f"Current list: {data}")
    
    elif choice == "3":
        # Option 3: Exit program
        print("Goodbye!")
        break
    
    else:
        # Handle invalid menu choice
        print("Invalid choice! Please enter 1, 2, or 3.")

Key improvements over the original buggy version:

✅ Added prompt text to input() - users know what to enter
✅ Added try-except for error handling - won't crash on invalid input
✅ Used string comparison instead of int - avoids crash if user enters text
✅ Added feedback messages - users know what happened
✅ Added empty list check - better user experience

ORIGINAL BUGGY VERSION (what was wrong):

# PROBLEMATIC CODE - DO NOT USE IN EXAM
data = []
while True:
    print("1.Add")
    print("2.Show")
    print("3.Exit")
    c = int(input())  # BUG: Crashes if user enters non-integer!
    
    if c == 1:
        data.append(int(input()))  # BUG: Crashes on invalid input, no prompt!
    elif c == 2:
        print(data)
    elif c == 3:
        break
# Missing: else clause, error handling, user prompts

Why it crashes: int(input()) without try-except raises ValueError if the user enters anything that is not a valid integer (for example pressing Enter, typing "abc", or typing "3.5").

✍️ HANDWRITING VERSION (condensed)

Keeps only the core logic, with all comments and error handling removed:

data = []
while True:
    print("1.Add 2.Show 3.Exit")
    c = input("Choice: ")
    if c == "1":
        data.append(int(input("Num: ")))
    elif c == "2":
        print(data)
    elif c == "3":
        break

Handwriting tips: about 10 lines; must include while True plus break to exit

Template 2: List with Sentinel Value (-1)

Problem: Write a Python program that:

Allows user to enter integers
Stops when user enters -1
Prints the minimum, maximum, and average

💡 Click to View Verified Answer

# Initialize empty list to store user's numbers
nums = []
 
print("Enter integers. Enter -1 to stop.")
 
# Main input loop
while True:
    try:
        # Get integer input with clear prompt
        n = int(input("Enter a number (-1 to stop): "))
        
        # Check for sentinel value
        if n == -1:
            break  # Exit loop when user enters -1
        
        # Add valid number to list
        nums.append(n)
        
    except ValueError:
        # Handle non-integer input
        print("Invalid input! Please enter an integer.")
 
# Calculate and display statistics
# IMPORTANT: Check if list is empty to avoid division by zero!
if len(nums) == 0:
    print("No numbers were entered.")
else:
    minimum = min(nums)
    maximum = max(nums)
    average = sum(nums) / len(nums)
    
    print(f"\nResults:")
    print(f"Minimum: {minimum}")
    print(f"Maximum: {maximum}")
    print(f"Average: {average:.2f}")  # .2f for 2 decimal places

Sample run:

Enter integers. Enter -1 to stop.
Enter a number (-1 to stop): 5
Enter a number (-1 to stop): 10
Enter a number (-1 to stop): 3
Enter a number (-1 to stop): -1

Results:
Minimum: 3
Maximum: 10
Average: 6.00

Edge case handling: Always check if list is empty before calculating statistics! min([]) and max([]) will raise ValueError, and sum([])/len([]) will raise ZeroDivisionError.
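
The empty-list check can be packaged in a small function so it is testable without user input. This is a sketch; `summarize` is a hypothetical helper name, not part of the template above.

```python
def summarize(nums):
    """Return (minimum, maximum, average), or None when nums is empty."""
    if not nums:                 # an empty list is falsy
        return None
    return min(nums), max(nums), sum(nums) / len(nums)

# Without the guard, min() on an empty list raises ValueError.
try:
    min([])
except ValueError as err:
    empty_error = type(err).__name__

print(summarize([5, 10, 3]))
print(summarize([]))
print(empty_error)
```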

Template 3: Dictionary Operations

Problem: Write a Python program that:

Stores student names and marks in a dictionary
Allows multiple entries
Prints the average mark

💡 Click to View Verified Answer

# Initialize empty dictionary: {name: mark}
students = {}
 
print("Student Grade Recorder")
print("Enter student names and marks. Type 'stop' as name to finish.")
 
# Main input loop
while True:
    # Get student name with prompt
    name = input("\nEnter student name (or 'stop' to finish): ")
    
    # Check for stop condition (case-insensitive)
    if name.lower() == "stop":
        break
    
    # Check for empty name
    if name.strip() == "":
        print("Name cannot be empty!")
        continue
    
    # Get mark with error handling
    try:
        mark = int(input(f"Enter mark for {name}: "))
        
        # Optional: Validate mark range
        if mark < 0 or mark > 100:
            print("Warning: Mark is outside 0-100 range.")
        
        # Store in dictionary
        students[name] = mark
        print(f"Recorded: {name} = {mark}")
        
    except ValueError:
        print("Invalid mark! Please enter a number.")
 
# Calculate and display average
if len(students) == 0:
    print("\nNo students were recorded.")
else:
    # Get all marks using .values()
    all_marks = students.values()
    average = sum(all_marks) / len(all_marks)
    
    print(f"\n--- Student Records ---")
    for name, mark in students.items():
        print(f"{name}: {mark}")
    print(f"\nAverage mark: {average:.2f}")

Key dictionary operations:

dict.values() - get all values (marks)
dict.items() - get all key-value pairs
dict.keys() - get all keys (names)
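
The three methods can be tried on a small hard-coded dictionary. The student names and marks below are made-up sample data, and `top_name` is an illustrative extra.

```python
students = {"Ann": 80, "Bob": 60, "Cara": 70}   # sample data for illustration

names = list(students.keys())      # all keys
marks = list(students.values())    # all values
pairs = list(students.items())     # (key, value) tuples

average = sum(students.values()) / len(students)
top_name = max(students, key=students.get)   # key with the highest mark

print(names, marks, pairs)
print(average, top_name)
```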

✍️ HANDWRITING VERSION (condensed)

students = {}
while True:
    name = input("Name (stop to end): ")
    if name == "stop":
        break
    mark = int(input("Mark: "))
    students[name] = mark
# Calculate average
avg = sum(students.values()) / len(students)
print("Average:", avg)

Handwriting tips: about 10 lines; store the data in a dict; use .values() to get all marks

Template 4: Grade Calculator (Decision Structure)

Problem: Write a function grade_calculator(score) that:

Returns letter grade: 90+ → "A", 80+ → "B", 70+ → "C", 60+ → "D", <60 → "F"
Returns "Invalid" for negative or > 100

💡 Click to View Verified Answer

def grade_calculator(score):
    """
    Convert numeric score to letter grade.
    
    Args:
        score: Numeric score (expected 0-100)
    
    Returns:
        str: Letter grade (A/B/C/D/F) or "Invalid"
    """
    # FIRST: Check for invalid input
    # Must check this BEFORE checking grade ranges
    if score < 0 or score > 100:
        return "Invalid"
    
    # Check grades from highest to lowest
    # Using elif ensures only ONE condition is matched
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"
 
 
# Test the function
if __name__ == "__main__":
    test_scores = [95, 85, 73, 65, 45, -5, 105]
    for s in test_scores:
        print(f"Score {s} → Grade {grade_calculator(s)}")

Output:

Score 95 → Grade A
Score 85 → Grade B
Score 73 → Grade C
Score 65 → Grade D
Score 45 → Grade F
Score -5 → Grade Invalid
Score 105 → Grade Invalid

Common mistakes:

Not checking invalid input FIRST
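Using multiple if statements instead of elif when assigning the grade to a variable (a later check overwrites the earlier result; separate ifs are only safe when each one returns immediately)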
Checking in wrong order (60+ before 90+)
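
Two of these mistakes can be reproduced in a few lines. The functions below are deliberately buggy illustrations with hypothetical names; they are not meant to be used as answers.

```python
def grade_wrong_order(score):
    # Bug: the lowest threshold is checked first, so every score >= 60 gets "D".
    if score >= 60:
        return "D"
    elif score >= 90:
        return "A"
    return "F"

def grade_no_elif(score):
    # Bug: independent ifs that assign; every later match overwrites the grade.
    grade = "F"
    if score >= 90:
        grade = "A"
    if score >= 80:
        grade = "B"
    if score >= 70:
        grade = "C"
    if score >= 60:
        grade = "D"
    return grade

print(grade_wrong_order(95), grade_no_elif(95))
```

Both functions report "D" for a score of 95, which is why the order of checks and the use of elif (or an immediate return) matter.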

✍️ HANDWRITING VERSION (condensed)

def grade(score):
    if score < 0 or score > 100:
        return "Invalid"
    if score >= 90: return "A"
    if score >= 80: return "B"
    if score >= 70: return "C"
    if score >= 60: return "D"
    return "F"

Handwriting tips: about 8 lines; check for invalid input first, then test from highest to lowest

Template 5: Boolean Function (PREDICTED TOPIC!)

Problem: Write a function that returns True/False based on conditions (function + and/or/not)

💡 Click to View Examples

Example 1: Check if number is in range [10, 50]

def in_range(n):
    return n >= 10 and n <= 50
 
print(in_range(25))  # True
print(in_range(5))   # False

Example 2: Check if all three numbers are positive

def all_positive(a, b, c):
    return a > 0 and b > 0 and c > 0
 
print(all_positive(1, 2, 3))   # True
print(all_positive(1, -2, 3))  # False

Example 3: Check if at least one is even

def has_even(a, b, c):
    return a % 2 == 0 or b % 2 == 0 or c % 2 == 0
 
print(has_even(1, 3, 5))  # False
print(has_even(1, 2, 5))  # True

Example 4: Check if string is valid password

def is_valid_password(pwd):
    # At least 8 characters and contains digit
    has_length = len(pwd) >= 8
    has_digit = any(c.isdigit() for c in pwd)
    return has_length and has_digit
 
print(is_valid_password("abc12345"))  # True
print(is_valid_password("short1"))    # False

Boolean operators:

and = both conditions must hold
or = at least one condition must hold
not = negation

Keywords: return True/False, combining conditions
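
The examples above use and/or; the sketch below adds `not` and shows that the same test can often be written in more than one equivalent way. The function names are illustrative.

```python
def in_range(n):
    # Chained comparison: same meaning as n >= 10 and n <= 50
    return 10 <= n <= 50

def out_of_range(n):
    # not flips True to False and False to True
    return not in_range(n)

def out_of_range_v2(n):
    # Equivalent form written with or instead of not
    return n < 10 or n > 50

print(in_range(25), out_of_range(25), out_of_range_v2(5))
```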

✍️ HANDWRITING VERSION (condensed)

def is_valid(x, y, z):
    return x > 0 and y > 0 and z > 0

Handwriting tips: a single return line is enough; combine conditions with and/or

Template 6: Prime Number Check

Problem: Write a function is_prime(num) that returns True if prime, False otherwise.

💡 Click to View Verified Answer

def is_prime(num):
    """
    Check if a number is prime.
    
    A prime number is:
    - Greater than 1
    - Only divisible by 1 and itself
    
    Args:
        num: Integer to check
    
    Returns:
        bool: True if prime, False otherwise
    """
    # Numbers less than 2 are not prime
    # (0, 1, and negative numbers)
    if num < 2:
        return False
    
    # 2 is the only even prime
    if num == 2:
        return True
    
    # All other even numbers are not prime
    if num % 2 == 0:
        return False
    
    # Check odd divisors up to square root of num
    # Why sqrt? If n = a × b, one of a,b must be ≤ √n
    # We use int(num ** 0.5) + 1 to include the square root
    for i in range(3, int(num ** 0.5) + 1, 2):  # Step by 2 (odd numbers only)
        if num % i == 0:
            return False  # Found a divisor, not prime
    
    return True  # No divisors found, it's prime
 
 
# Test the function
if __name__ == "__main__":
    test_nums = [1, 2, 3, 7, 10, 11, 25, 29]
    for n in test_nums:
        result = "Prime" if is_prime(n) else "Not Prime"
        print(f"{n}: {result}")

Output:

1: Not Prime
2: Prime
3: Prime
7: Prime
10: Not Prime
11: Prime
25: Not Prime
29: Prime

Optimization: Only checking up to √n reduces time complexity from O(n) to O(√n).
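
To check that stopping at √n does not change the answer, a naive version can be compared against a square-root version. Both functions below are simplified sketches with hypothetical names, written without the even-number shortcut used in the template.

```python
def is_prime_naive(num):
    # Tries every candidate divisor from 2 to num - 1: O(n) checks.
    if num < 2:
        return False
    for i in range(2, num):
        if num % i == 0:
            return False
    return True

def is_prime_sqrt(num):
    # Tries candidate divisors only up to the square root: O(sqrt(n)) checks.
    if num < 2:
        return False
    for i in range(2, int(num ** 0.5) + 1):
        if num % i == 0:
            return False
    return True

# The two versions should agree on every input in this range.
agree = all(is_prime_naive(n) == is_prime_sqrt(n) for n in range(200))
print(agree, is_prime_sqrt(97), is_prime_sqrt(91))
```

For very large integers, `math.isqrt(num)` avoids floating-point rounding in the square root; for exam-sized inputs `int(num ** 0.5)` is fine.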

Template 7: Fibonacci Sequence

Problem: Write a function fibonacci(n) that returns the first n Fibonacci numbers as a list.

💡 Click to View Verified Answer

def fibonacci(n):
    """
    Generate the first n Fibonacci numbers.
    
    Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, ...
    Each number is the sum of the two preceding numbers.
    
    Args:
        n: Number of Fibonacci numbers to generate
    
    Returns:
        list: First n Fibonacci numbers
    """
    # Handle edge cases
    if n <= 0:
        return []  # Empty list for invalid input
    if n == 1:
        return [0]  # Only the first number
    
    # Start with first two Fibonacci numbers
    result = [0, 1]
    
    # Generate remaining numbers
    for i in range(2, n):
        # Each new number = sum of last two
        next_num = result[-1] + result[-2]  # Use negative indexing
        result.append(next_num)
    
    return result
 
 
# Test the function
if __name__ == "__main__":
    for count in [0, 1, 5, 10]:
        print(f"fibonacci({count}) = {fibonacci(count)}")

Output:

fibonacci(0) = []
fibonacci(1) = [0]
fibonacci(5) = [0, 1, 1, 2, 3]
fibonacci(10) = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

Template 8: Remove Duplicates (Preserve Order)

Problem: Write a function that removes duplicates from a list while preserving the order of first occurrence.

💡 Click to View Verified Answer

def remove_duplicates(lst):
    """
    Remove duplicate elements while preserving first occurrence order.
    
    Example: [1, 2, 2, 3, 1, 4] → [1, 2, 3, 4]
    
    Args:
        lst: Input list with possible duplicates
    
    Returns:
        list: New list with duplicates removed
    """
    seen = []  # Track items we've already seen
    
    for item in lst:
        if item not in seen:  # Only add if not seen before
            seen.append(item)
    
    return seen
 
 
# Alternative using dict (Python 3.7+ preserves insertion order)
def remove_duplicates_v2(lst):
    """
    Remove duplicates using dictionary (more efficient for large lists).
    dict.fromkeys() preserves first occurrence order.
    """
    return list(dict.fromkeys(lst))
 
 
# Test both versions
if __name__ == "__main__":
    test = [1, 2, 2, 3, 1, 4, 5, 3, 2]
    print(f"Original: {test}")
    print(f"Method 1: {remove_duplicates(test)}")
    print(f"Method 2: {remove_duplicates_v2(test)}")

Output:

Original: [1, 2, 2, 3, 1, 4, 5, 3, 2]
Method 1: [1, 2, 3, 4, 5]
Method 2: [1, 2, 3, 4, 5]

Why not use set()? Sets don't preserve order! list(set([1, 2, 2, 3, 1, 4])) might give [1, 2, 3, 4] but order is not guaranteed.
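
With strings the difference is easier to observe than with small integers. The sketch below makes no claim about what order `set()` produces, only that the members are the same; the sample words are made up.

```python
items = ["pear", "apple", "pear", "fig", "apple"]

ordered = list(dict.fromkeys(items))   # keeps first-occurrence order
unordered = list(set(items))           # same members, order not guaranteed

same_members = sorted(unordered) == sorted(ordered)

print(ordered)
print(unordered)
print(same_members)
```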

Template 9: Exception Handling

Problem: Write a program that repeatedly asks for a nonzero integer and calculates its reciprocal, handling invalid inputs.

💡 Click to View Verified Answer

def get_reciprocal():
    """
    Get a nonzero integer from user and calculate its reciprocal.
    Handles ValueError (non-integer) and ZeroDivisionError (zero input).
    """
    while True:
        try:
            # Get input from user
            n = int(input("Enter a nonzero integer: "))
            
            # Calculate reciprocal (will raise ZeroDivisionError if n=0)
            reciprocal = 1 / n
            
            # If we get here, input was valid
            print(f"The reciprocal of {n} is {reciprocal:.3f}")
            break  # Exit loop on success
            
        except ValueError:
            # int() failed - input was not a valid integer
            print("Error: You did not enter a valid integer. Try again.")
            
        except ZeroDivisionError:
            # Division by zero
            print("Error: You entered zero. Cannot divide by zero. Try again.")
 
 
# Run the function
if __name__ == "__main__":
    get_reciprocal()

Sample run:

Enter a nonzero integer: abc
Error: You did not enter a valid integer. Try again.
Enter a nonzero integer: 0
Error: You entered zero. Cannot divide by zero. Try again.
Enter a nonzero integer: 4
The reciprocal of 4 is 0.250

📝 Q3: Machine Learning - Theory & Code (25 points)

Flowchart Symbols (MUST KNOW!)

Keywords: Flowcharts are used to design and explain program logic

💡 Click to View All 5 Symbols

Symbol	Shape	Name	Purpose
⬭	Oval	Terminal	Start/End of the flowchart
▱	Parallelogram	I/O	Input/Output operations (e.g., enter values, display results)
▭	Rectangle	Process	Processing/Calculation (e.g., x = a + b)
◇	Diamond	Decision	Condition check (Yes/No branches)
→	Arrow	Flow Line	Direction of flow in program logic

Example question: Draw a flowchart to find the largest among three numbers (a, b, c).

Flowchart structure:

[Start] → [Input a, b, c] → <a > b?> 
                               ↓Yes        ↓No
                           <a > c?>    <b > c?>
                           ↓Yes  ↓No   ↓Yes  ↓No
                         [max=a][max=c][max=b][max=c]
                               ↓ ↓ ↓ ↓
                         [Output max] → [End]

Key points for exam:

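Start/End: Every flowchart must have a start symbol and an end symbol
Input: Obtain the input before processing
Decision: Use a diamond for each condition check, with Yes/No branches
Process: Write calculations inside rectangles
Arrows: Connect all symbols with arrows to show the direction of flow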
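
The flowchart above maps directly onto nested if/else statements, one per diamond. This is a sketch of that translation; `largest_of_three` is a hypothetical function name, and the values are passed as arguments instead of being read with input().

```python
def largest_of_three(a, b, c):
    if a > b:              # first diamond: a > b?
        if a > c:          # Yes branch: a > c?
            largest = a
        else:
            largest = c
    else:
        if b > c:          # No branch: b > c?
            largest = b
        else:
            largest = c
    return largest         # corresponds to [Output max]

print(largest_of_three(4, 9, 2))
```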

Algorithm Comparison Table (MUST KNOW!)

Keywords: Choose the model that suits the characteristics of the data

Algorithm	Best For	Pros	Cons	When to Use?
Naive Bayes	Small, low-dimensional	Fast, simple, low variance	Independence assumption unrealistic	Compare probabilities, pick highest
Logistic Regression	Small-medium data	Interpretable, stable	Linear separation only	Linear relationship, probability output
Decision Tree	Small-medium data	Easy to understand, visual	Prone to overfitting	Need clear rules, explainable
KNN	Small data	Simple, no training	Slow, sensitive to noise	Small data, low dimensions, classification
SVM	Small, high-dimensional	High accuracy, good generalization	Complex, hard to tune	High-dimensional data, margin-based
Random Forest	Medium data	Accurate, resists overfitting	Less interpretable	A single decision tree overfits; bagged ensemble of trees

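💡 Model Selection Quick Rules (exam quick reference)

Q: Small data, low dimensionality, classification problem: which model? A: Naive Bayes - performs well, low variance, needs little data, designed for classification

Q: How is Random Forest better than a Decision Tree? A: Less overfitting - it combines many decision trees (bagging) to improve accuracy

Q: What happens as K increases in KNN? A: Larger K → lower variance (more stable) + higher bias

Q: What data suits SVM? A: Strong on high-dimensional data, slow on large data - works well in high-dimensional spaces, but is computationally expensive for large datasets

Q: Why use an Encoder before an ML model? A: Machine learning models require numerical input; encoders transform categorical data into numbers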

Pandas Basics

💡 Click to View Common Operations

import pandas as pd
 
# ========== Reading Data ==========
# Read CSV file
df = pd.read_csv("data.csv")
 
# Display first/last rows
print(df.head())   # First 5 rows
print(df.tail())   # Last 5 rows
print(df.shape)    # (rows, columns)
 
# ========== Handling Missing Values ==========
# Check for missing values
print(df.isnull().sum())  # Count of nulls per column
 
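# Fill missing values (assign the result back: calling fillna(..., inplace=True)
# on a selected column may not update df in recent pandas versions)
df['Age'] = df['Age'].fillna(df['Age'].mean())        # Option A: fill with mean
# df['Age'] = df['Age'].fillna(df['Age'].median())    # Option B: fill with median instead
df['Salary'] = df['Salary'].fillna(50000)             # Fill with specific value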
 
# Drop rows with missing values
df.dropna(inplace=True)
 
# ========== Selecting Data ==========
# Select single column
ages = df['Age']
 
# Select multiple columns
subset = df[['Name', 'Age']]
 
# Filter rows
adults = df[df['Age'] >= 18]
 
# Multiple conditions (use & for AND, | for OR)
result = df[(df['Age'] >= 18) & (df['Department'] == 'IT')]
 
# ========== Grouping ==========
# Group by and aggregate
avg_salary = df.groupby('Department')['Salary'].mean()

LabelEncoder for Categorical Data

💡 Click to View Example

from sklearn.preprocessing import LabelEncoder
import pandas as pd
 
# Sample data with categorical columns
data = {
    'Color': ['Red', 'Blue', 'Green', 'Red', 'Blue'],
    'Size': ['Small', 'Medium', 'Large', 'Medium', 'Small']
}
df = pd.DataFrame(data)
 
# Create LabelEncoder instance
le = LabelEncoder()
 
# Encode each categorical column
# LabelEncoder sorts values alphabetically then assigns 0, 1, 2...
for col in df.columns:
    if df[col].dtype == 'object':  # Check if column is string/object type
        df[col] = le.fit_transform(df[col])
 
print(df)
# Color encoding: Blue=0, Green=1, Red=2 (alphabetical)
# Size encoding: Large=0, Medium=1, Small=2 (alphabetical)

LabelEncoder mapping (classes are sorted, so alphabetical for strings):

Original	Encoded
Blue	0
Green	1
Red	2
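
The sort-then-number idea behind this mapping can be reproduced in plain Python. This is only an illustration of the rule, not scikit-learn's implementation; `classes`, `mapping`, and `encoded` are illustrative names.

```python
colors = ['Red', 'Blue', 'Green', 'Red', 'Blue']

# Step 1: collect the unique labels and sort them.
classes = sorted(set(colors))

# Step 2: number the sorted labels 0, 1, 2, ...
mapping = {label: code for code, label in enumerate(classes)}

# Step 3: replace every label with its code.
encoded = [mapping[c] for c in colors]

print(classes)
print(mapping)
print(encoded)
```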

Train-Test Split

💡 Click to View Example

from sklearn.model_selection import train_test_split
 
# Assume X = features, y = target variable
X = df.drop('target', axis=1)  # All columns except target
y = df['target']               # Target column only
 
# Split data: 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, 
    test_size=0.2,      # 20% for testing
    random_state=42     # For reproducibility
)
 
print(f"Training set size: {len(X_train)}")
print(f"Testing set size: {len(X_test)}")

Why train-test split?

Evaluate model on UNSEEN data
Detect overfitting (memorizing training data)
Simulate real-world usage
Get honest performance estimate
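
The idea of holding out unseen data can be sketched without any library: shuffle the rows with a fixed seed, then cut off a share for testing. `simple_split` is a hypothetical helper for illustration and does not reproduce how scikit-learn's train_test_split chooses rows.

```python
import random

def simple_split(rows, test_size=0.2, seed=42):
    """Hypothetical helper: shuffle row indices, then cut off the test share."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)       # seeded, so the split is repeatable
    n_test = round(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [rows[i] for i in train_idx]
    test = [rows[i] for i in test_idx]
    return train, test

rows = list(range(10))                         # stand-in for 10 samples
train_rows, test_rows = simple_split(rows)
print(len(train_rows), len(test_rows))
```

No row appears in both parts, so the test rows stay unseen during training, and the fixed seed plays the same role as random_state.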

fit() vs predict()

💡 Click to View Explanation

Method	Purpose	When Used
`fit()`	Train the model	Once, on training data only
`predict()`	Apply the model	On test/new data
`fit_transform()`	Fit and transform in one step	For preprocessing (scaler, encoder)

Important:

Use fit_transform() on training data
Use transform() only on test data (NOT fit_transform!)

# CORRECT workflow for scaling
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # Fit AND transform
X_test_scaled = scaler.transform(X_test)         # Transform only (no fit!)
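
What "fit on training data, transform the test data" means can be shown with plain arithmetic. The sketch below uses made-up numbers and the standard library; it mirrors the z-score formula with the population standard deviation and is not scikit-learn code.

```python
from statistics import mean, pstdev

train = [10.0, 20.0, 30.0]   # made-up training values
test = [40.0]                # made-up test value

# "fit": learn the mean and standard deviation from the training data only
mu = mean(train)
sigma = pstdev(train)

# "transform": apply the SAME mu and sigma to both sets
train_scaled = [(x - mu) / sigma for x in train]
test_scaled = [(x - mu) / sigma for x in test]

print(train_scaled)
print(test_scaled)
```

If the test set were fitted separately, a single value of 40.0 would have no meaningful standard deviation of its own, and information from the test data would leak into preprocessing.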

SVM Implementation

💡 Click to View Complete Code

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report
 
# Step 1: Load data
df = pd.read_csv("data.csv")
 
# Step 2: Encode categorical features
le = LabelEncoder()
for col in df.select_dtypes(include=['object']).columns:
    if col != 'target':  # Don't encode target yet if needed later
        df[col] = le.fit_transform(df[col])
 
# Step 3: Separate features (X) and target (y)
X = df.drop('target', axis=1)
y = df['target']
 
# Step 4: Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
 
# Step 5: Scale features (IMPORTANT for SVM!)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # Fit on training data
X_test_scaled = scaler.transform(X_test)         # Transform only on test data
 
# Step 6: Create and train SVM model
svm_model = SVC(kernel='rbf', random_state=42)  # RBF kernel is default
svm_model.fit(X_train_scaled, y_train)
 
# Step 7: Make predictions
y_pred = svm_model.predict(X_test_scaled)
 
# Step 8: Evaluate
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

SVM Key Points:

Scaling is strongly recommended - SVM is sensitive to feature magnitudes
Kernel trick - transforms data to higher dimensions for separation
Common kernels: 'linear', 'rbf' (Gaussian), 'poly' (polynomial)

Random Forest Implementation

💡 Click to View Complete Code

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
 
# Step 1: Load and prepare data
df = pd.read_csv("data.csv")
 
# Step 2: Handle categorical (if needed)
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
for col in df.select_dtypes(include=['object']).columns:
    df[col] = le.fit_transform(df[col])
 
# Step 3: Split features and target
X = df.drop('target', axis=1)
y = df['target']
 
# Step 4: Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
 
# Step 5: Create and train Random Forest
# Note: No scaling needed for tree-based models!
rf_model = RandomForestClassifier(
    n_estimators=100,    # Number of trees
    max_depth=None,      # No limit on depth
    random_state=42
)
rf_model.fit(X_train, y_train)
 
# Step 6: Predictions and evaluation
y_train_pred = rf_model.predict(X_train)
y_test_pred = rf_model.predict(X_test)
 
print(f"Training Accuracy: {accuracy_score(y_train, y_train_pred):.2f}")
print(f"Testing Accuracy: {accuracy_score(y_test, y_test_pred):.2f}")
 
# Step 7: Feature importance
importance = pd.DataFrame({
    'feature': X.columns,
    'importance': rf_model.feature_importances_
}).sort_values('importance', ascending=False)
print("\nFeature Importance:")
print(importance)

Random Forest Key Points:

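Bagging: Trains each tree on a bootstrap sample of the rows (drawn with replacement) and considers a random subset of features at each split
Reduces overfitting: Combining many trees by majority vote is more stable than a single tree
No scaling needed: Tree splits depend on the ordering of feature values, not their magnitude
Feature importance: Shows which features matter most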
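
The two mechanics behind bagging, bootstrap sampling and majority voting, can be sketched with the standard library alone. This is a simplified illustration, not how `RandomForestClassifier` is implemented; the helper names and the three hard-coded tree votes are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement (some repeat, some are left out)."""
    return rng.choices(rows, k=len(rows))

def majority_vote(predictions):
    """Return the most common class label among the trees' predictions."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(42)          # fixed seed so the run is repeatable
rows = list(range(10))           # stand-in for 10 training rows
samples = [bootstrap_sample(rows, rng) for _ in range(3)]  # one sample per tree
for s in samples:
    print(sorted(s))

tree_votes = ["Spam", "Not Spam", "Spam"]   # hypothetical predictions from 3 trees
print(majority_vote(tree_votes))            # Spam
```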

StandardScaler vs MinMaxScaler

💡 Click to View Comparison

Scaler	Formula	Output Range	Best For
StandardScaler	(x - mean) / std	Mean=0, Std=1 (unbounded)	SVM, Logistic Regression; less distorted by outliers than MinMaxScaler
MinMaxScaler	(x - min) / (max - min)	[0, 1]	Neural Networks, KNN, bounded features

Quick rule:

SVM, Linear models → StandardScaler
Neural networks, images → MinMaxScaler

from sklearn.preprocessing import StandardScaler, MinMaxScaler
 
# X is the feature matrix; with a train/test split, fit on X_train only, then transform X_test
# StandardScaler: Z-score normalization
scaler1 = StandardScaler()
X_standard = scaler1.fit_transform(X)
 
# MinMaxScaler: Scale to [0, 1]
scaler2 = MinMaxScaler()
X_minmax = scaler2.fit_transform(X)
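
Both formulas from the comparison table can be checked by hand without scikit-learn. The sketch below uses a made-up single feature column; `statistics.pstdev` is used because StandardScaler divides by the population standard deviation.

```python
import statistics

values = [10, 20, 30, 40, 50]    # illustrative single feature column

mean = statistics.fmean(values)
std = statistics.pstdev(values)  # population std (divide by n)

# StandardScaler formula: (x - mean) / std
standardized = [(x - mean) / std for x in values]

# MinMaxScaler formula: (x - min) / (max - min)
lo, hi = min(values), max(values)
min_max = [(x - lo) / (hi - lo) for x in values]

print([round(z, 3) for z in standardized])  # [-1.414, -0.707, 0.0, 0.707, 1.414]
print(min_max)                              # [0.0, 0.25, 0.5, 0.75, 1.0]
```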

📝 Q4: Naive Bayes & Decision Tree (25 points)

Bayes' Theorem Formula

$$P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$$

Where:

$P(A|B)$ = Posterior probability (what we want)
$P(B|A)$ = Likelihood
$P(A)$ = Prior probability
$P(B)$ = Evidence (normalizing constant)

Naive Bayes Calculation Example

💡 Click to View Worked Example

Dataset: Classify emails as Spam or Not Spam

Email	Contains "Free"	Contains "Winner"	Spam?
1	Yes	Yes	Spam
2	Yes	No	Spam
3	No	Yes	Spam
4	No	No	Not Spam
5	Yes	No	Not Spam
6	No	No	Not Spam

Question: New email has "Free"=Yes, "Winner"=No. Is it Spam?

Step 1: Calculate Priors

P(Spam) = 3/6 = 0.5
P(Not Spam) = 3/6 = 0.5

Step 2: Calculate Likelihoods

For Spam emails (1, 2, 3):

P(Free=Yes | Spam) = 2/3 (emails 1, 2)
P(Winner=No | Spam) = 1/3 (email 2 only)

For Not Spam emails (4, 5, 6):

P(Free=Yes | Not Spam) = 1/3 (email 5)
P(Winner=No | Not Spam) = 3/3 = 1 (all three)

Step 3: Calculate Unnormalized Posteriors

$P(Spam | evidence) \propto P(Spam) \times P(Free=Yes|Spam) \times P(Winner=No|Spam)$ $= 0.5 \times \frac{2}{3} \times \frac{1}{3} = 0.111$

$P(Not Spam | evidence) \propto 0.5 \times \frac{1}{3} \times 1 = 0.167$

Step 4: Normalize

$P(Spam | evidence) = \frac{0.111}{0.111 + 0.167} = \frac{0.111}{0.278} = 0.40 = 40\%$

Prediction: NOT SPAM, because P(Spam | evidence) = 40% is lower than P(Not Spam | evidence) = 60%
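
The four steps can be reproduced in code to check the arithmetic. This is a minimal sketch that counts directly from the six-email table and, like the worked example, applies no smoothing; the function name `unnormalized_posterior` is illustrative. Exact fractions are used so no rounding is involved.

```python
from fractions import Fraction

# (contains_free, contains_winner, label) rows copied from the table
emails = [
    ("Yes", "Yes", "Spam"),
    ("Yes", "No", "Spam"),
    ("No", "Yes", "Spam"),
    ("No", "No", "Not Spam"),
    ("Yes", "No", "Not Spam"),
    ("No", "No", "Not Spam"),
]

def unnormalized_posterior(label, free, winner):
    """Prior times the two likelihoods, all estimated by counting rows."""
    rows = [e for e in emails if e[2] == label]
    prior = Fraction(len(rows), len(emails))
    p_free = Fraction(sum(1 for e in rows if e[0] == free), len(rows))
    p_winner = Fraction(sum(1 for e in rows if e[1] == winner), len(rows))
    return prior * p_free * p_winner

score_spam = unnormalized_posterior("Spam", "Yes", "No")
score_not_spam = unnormalized_posterior("Not Spam", "Yes", "No")
p_spam = score_spam / (score_spam + score_not_spam)   # normalization step
prediction = "Spam" if p_spam > Fraction(1, 2) else "Not Spam"

print(score_spam, score_not_spam)   # 1/9 1/6
print(p_spam, float(p_spam))        # 2/5 0.4
print(prediction)                   # Not Spam
```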

Gini Index Formula & Calculation

$$Gini = 1 - \sum_{i=1}^{n} p_i^2$$

Where $p_i$ is the proportion of class $i$ in the node.

💡 Click to View Worked Example

Dataset: 20 emails (12 Spam, 8 Not Spam)

Split by "Contains Free":

Contains "free": 10 emails (9 Spam, 1 Not Spam)
No "free": 10 emails (3 Spam, 7 Not Spam)

Step 1: Gini for "Contains Free" node (9S, 1N)

P(Spam) = 9/10 = 0.9
P(Not Spam) = 1/10 = 0.1
Gini = 1 - (0.9² + 0.1²) = 1 - (0.81 + 0.01) = 0.18

Step 2: Gini for "No Free" node (3S, 7N)

P(Spam) = 3/10 = 0.3
P(Not Spam) = 7/10 = 0.7
Gini = 1 - (0.3² + 0.7²) = 1 - (0.09 + 0.49) = 0.42

Step 3: Weighted Average Gini $Gini_{split} = \frac{10}{20} \times 0.18 + \frac{10}{20} \times 0.42$ $= 0.5 \times 0.18 + 0.5 \times 0.42 = 0.09 + 0.21 = 0.30$

Final Answer: Gini for this split = 0.30

Interpretation: Lower Gini = better split. Pure node has Gini = 0.
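
The same numbers can be verified with a short script. This sketch assumes class counts are passed as lists such as `[9, 1]`; the helper names are illustrative. It also computes the Gini of the parent node (12 Spam, 8 Not Spam) so the reduction achieved by the split is visible.

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def weighted_gini(children):
    """Size-weighted average Gini over the child nodes of a split."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * gini(child) for child in children)

parent = gini([12, 8])                     # whole dataset before splitting
left = gini([9, 1])                        # contains "free"
right = gini([3, 7])                       # no "free"
split = weighted_gini([[9, 1], [3, 7]])

print(round(parent, 2))          # 0.48
print(round(left, 2))            # 0.18
print(round(right, 2))           # 0.42
print(round(split, 2))           # 0.3
print(round(parent - split, 2))  # 0.18 (Gini reduction from this split)
```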

Information Gain (Entropy)

$$Entropy = -\sum_{i=1}^{n} p_i \log_2(p_i)$$

$$Information\ Gain = Entropy(parent) - \sum_{children} \frac{n_{child}}{n_{parent}} \times Entropy(child)$$

💡 Click to View Worked Example

Parent node: 3 Spam, 3 Not Spam (50/50 split)

Parent Entropy (perfect balance = maximum entropy): $H(parent) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5)$ $= -0.5(-1) - 0.5(-1) = 0.5 + 0.5 = 1.0$

Child node "Free=Yes" (2 Spam, 1 Not Spam): $H = -\frac{2}{3} \log_2(\frac{2}{3}) - \frac{1}{3} \log_2(\frac{1}{3})$ $= 0.390 + 0.528 = 0.918$

Child node "Free=No" (1 Spam, 2 Not Spam): $H = -\frac{1}{3} \log_2(\frac{1}{3}) - \frac{2}{3} \log_2(\frac{2}{3}) = 0.918$

Weighted Entropy: $= \frac{3}{6}(0.918) + \frac{3}{6}(0.918) = 0.918$

Information Gain: $IG = 1.0 - 0.918 = 0.082$
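
A small script confirms the entropy and information gain values. It is a sketch under the same setup as the worked example (class counts per node); zero counts are skipped, which matches the convention 0 × log₂(0) = 0.

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) of a node given its class counts."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the children."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

h_parent = entropy([3, 3])                         # 3 Spam, 3 Not Spam
h_child = entropy([2, 1])                          # Free=Yes node
ig = information_gain([3, 3], [[2, 1], [1, 2]])

print(round(h_parent, 3))  # 1.0
print(round(h_child, 3))   # 0.918
print(round(ig, 3))        # 0.082
```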

Complete Entropy Calculation Example (EXAM FORMAT!)

💡 Click to View Full Decision Tree Example

Dataset: Predict if student will pass

GPA	Studied	Passed
Low	No	No
Low	Yes	No
Med	No	No
Med	Yes	Yes
High	No	Yes
High	Yes	Yes

Question: Calculate H(Passed), H(Passed|GPA), H(Passed|Studied), then draw decision tree.

Step 1: Calculate H(Passed) - entropy of the target variable

Passed=Yes: 3 records → P(Yes) = 3/6 = 0.5
Passed=No: 3 records → P(No) = 3/6 = 0.5

$H(Passed) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5)$ $= -0.5 \times (-1) - 0.5 \times (-1) = 0.5 + 0.5 = 1.0$

Answer: H(Passed) = 1.0 (a perfect 50/50 distribution gives maximum entropy)

Step 2: Calculate H(Passed | GPA) - conditional entropy grouped by GPA

GPA = Low (2 records: 0 Yes, 2 No)

$H(Low) = -0 \log_2(0) - 1 \log_2(1) = 0$ (pure node!)

GPA = Med (2 records: 1 Yes, 1 No)

$H(Med) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5) = 1.0$

GPA = High (2 records: 2 Yes, 0 No)

$H(High) = -1 \log_2(1) - 0 \log_2(0) = 0$ (pure node!)

Weighted Average: $H(Passed|GPA) = \frac{2}{6} \times 0 + \frac{2}{6} \times 1.0 + \frac{2}{6} \times 0$ $= 0 + 0.333 + 0 = 0.333$

Answer: H(Passed|GPA) = 0.333

Step 3: Calculate H(Passed | Studied) - conditional entropy grouped by Studied

Studied = No (3 records: 1 Yes, 2 No)

P(Yes) = 1/3, P(No) = 2/3
$H(No) = -\frac{1}{3} \log_2(\frac{1}{3}) - \frac{2}{3} \log_2(\frac{2}{3}) = 0.528 + 0.390 = 0.918$

Studied = Yes (3 records: 2 Yes, 1 No)

P(Yes) = 2/3, P(No) = 1/3
H(Yes) = 0.918 (by symmetry with the Studied = No case)

Weighted Average: $H(Passed|Studied) = \frac{3}{6} \times 0.918 + \frac{3}{6} \times 0.918 = 0.918$

Answer: H(Passed|Studied) = 0.918

Step 4: Compare Information Gain

IG(GPA) = H(Passed) - H(Passed|GPA) = 1.0 - 0.333 = 0.667 ✅ Higher!
IG(Studied) = H(Passed) - H(Passed|Studied) = 1.0 - 0.918 = 0.082

Choose GPA as the root node (it has the higher information gain)

Step 5: Draw Decision Tree

         [GPA?]
       /   |   \
    Low   Med   High
     ↓     ↓      ↓
   [No] [Studied?] [Yes]
          /    \
        No     Yes
         ↓       ↓
       [No]   [Yes]

Log value quick reference (a calculator may be used in the exam):

log₂(0.5) = -1
log₂(1) = 0
log₂(1/3) ≈ -1.585
log₂(2/3) ≈ -0.585

Rule: 0 × log₂(0) = 0 (by convention)
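
The whole student example can be checked end to end by computing the conditional entropies straight from the six rows. This is a minimal sketch with illustrative names (`COLUMNS`, `conditional_entropy`); it only selects the root attribute and does not build the full tree.

```python
import math
from collections import Counter, defaultdict

# (GPA, Studied, Passed) rows copied from the dataset table
rows = [
    ("Low", "No", "No"),
    ("Low", "Yes", "No"),
    ("Med", "No", "No"),
    ("Med", "Yes", "Yes"),
    ("High", "No", "Yes"),
    ("High", "Yes", "Yes"),
]
COLUMNS = {"GPA": 0, "Studied": 1}
TARGET = 2

def entropy(labels):
    """Entropy (base 2) of a list of class labels."""
    n = len(labels)
    return sum(-(k / n) * math.log2(k / n) for k in Counter(labels).values())

def conditional_entropy(data, col):
    """Weighted entropy of the target after grouping rows by one column."""
    groups = defaultdict(list)
    for r in data:
        groups[r[col]].append(r[TARGET])
    return sum(len(g) / len(data) * entropy(g) for g in groups.values())

h_passed = entropy([r[TARGET] for r in rows])
gains = {name: h_passed - conditional_entropy(rows, idx)
         for name, idx in COLUMNS.items()}
root = max(gains, key=gains.get)   # attribute with the highest information gain

print(round(h_passed, 3))                          # 1.0
print({k: round(v, 3) for k, v in gains.items()})  # {'GPA': 0.667, 'Studied': 0.082}
print(root)                                        # GPA
```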

Decision Tree Code Template

💡 Click to View Complete Code

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report
 
# Load data
df = pd.read_csv("data.csv")
 
# Encode categorical if needed
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
for col in df.select_dtypes(include=['object']).columns:
    df[col] = le.fit_transform(df[col])
 
# Split features and target
X = df.drop('target', axis=1)
y = df['target']
 
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
 
# Create and train Decision Tree
# criterion='gini' is the default impurity measure
# criterion='entropy' chooses splits by information gain instead
# (scikit-learn builds CART-style binary trees with either criterion)
dt_model = DecisionTreeClassifier(
    criterion='gini',    # or 'entropy'
    max_depth=5,         # Limit depth to prevent overfitting
    random_state=42
)
dt_model.fit(X_train, y_train)
 
# Evaluate
y_pred = dt_model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print(classification_report(y_test, y_pred))

🎯 Quick Reference Checklist

Python Essentials

Operator precedence: ** > *,/,//,% > +,-
String slicing: left-inclusive, right-exclusive
Floor division // rounds toward negative infinity
range(a,b) generates a to b-1
List assignment creates reference, not copy

Machine Learning Essentials

Bayes formula: P(A|B) = P(B|A) × P(A) / P(B)
Gini: 1 - Σ(pᵢ²)
Entropy: -Σ pᵢ log₂(pᵢ)
SVM: needs scaling, uses kernel trick
Random Forest: reduces overfitting via bagging
Decision Tree: uses Gini (CART) or Entropy (ID3)

💪 Good luck on your exam! 🎓

📝 Q1: Python Output Analysis (20 points)

Key Pattern 1: Operator Precedence (MUST KNOW!)

Priority order: ** (power) → *, /, //, % → +, -

Operator	Meaning	Example
`**`	Power/Exponent	`5**2 = 25`
`/`	Division (float)	`5/2 = 2.5`
`//`	Floor division (integer)	`5//2 = 2`
`%`	Modulo (remainder)	`10%3 = 1`

Problem 1.1: Power and Floor Division

print(5**2 // 3)

💡 Click to View Answer

Step-by-step:

5**2 = 25 (power first, highest priority)
25 // 3 = 8 (floor division, discard remainder)

Answer: 8

Key concept: Power ** has higher priority than //. Floor division always rounds DOWN toward negative infinity.

Problem 1.2: Mixed Operations

print(3 + 4 * 4 // 4)

💡 Click to View Answer

Step-by-step:

4 * 4 = 16 (multiplication first)
16 // 4 = 4 (floor division, same priority as multiplication, left to right)
3 + 4 = 7 (addition last)

Answer: 7

Problem 1.3: Power and Multiplication

print(2 * 3 ** 2)

💡 Click to View Answer

Step-by-step:

3**2 = 9 (power first!)
2 * 9 = 18

Answer: 18

Common mistake: 2 * 3 = 6, then 6**2 = 36. WRONG! Power has higher priority.

Problem 1.4: Negative Floor Division (TRICKY!)

print(-5 // 3)

💡 Click to View Answer

Key insight: Floor division rounds toward NEGATIVE infinity, not toward zero!

-5 ÷ 3 = -1.666...
Rounding DOWN (toward -∞) → -2

Answer: -2

This is NOT the same as integer division in some other languages! Python's // always floors toward negative infinity.
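
A quick way to see the floor-versus-truncate difference is to compare `//` with `math.trunc()`, which rounds toward zero the way integer division does in languages such as C or Java. This is a minimal standard-library sketch with illustrative variable names.

```python
import math

a, b = -5, 3

floor_result = a // b              # floors toward negative infinity
trunc_result = math.trunc(a / b)   # truncates toward zero
remainder = a % b                  # takes the sign of the divisor in Python

print(floor_result)   # -2
print(trunc_result)   # -1
print(remainder)      # 1

# For integers, a == (a // b) * b + (a % b) always holds
print(a == floor_result * b + remainder)  # True
```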

Key Pattern 2: List Iteration and Sum

Problem 1.5: Calculate Average

numbers = [2, 4, 6, 8]
total = 0
for n in numbers:
    total += n
print(total / len(numbers))

💡 Click to View Answer

Trace:

Loop 1: total = 0 + 2 = 2
Loop 2: total = 2 + 4 = 6
Loop 3: total = 6 + 6 = 12
Loop 4: total = 12 + 8 = 20
Average: 20 / 4 = 5.0

Answer: 5.0

Note: Division / always returns a float in Python 3, so the answer is 5.0 not 5.

Key Pattern 3: Tuple Operations (MUST KNOW!)

Keywords: Tuple = immutable list, cannot be modified after creation, defined with ()

Problem 1.5a: Basic Tuple Index

t = ("study", "exercises", "exam")
print(t[1])

💡 Click to View Answer

Index map:

Element:  "study"  "exercises"  "exam"
Index:       0         1          2

Answer: exercises

Keywords: Tuple indexing starts at 0, the same as a list

Problem 1.5b: Tuple with len()

t = ("A", "B", "C")
print(len(t))

💡 Click to View Answer

Answer: 3

Keywords: len() counts the elements; it works the same way for tuples and lists

Problem 1.5c: Tuple Negative Indexing

t = ("A", "B", "C")
print(t[-1])

💡 Click to View Answer

Negative index map:

Element:  "A"   "B"   "C"
Negative:  -3    -2    -1

Answer: C

Keywords: -1 is the last element; negative indexes count from right to left

Problem 1.5d: Tuple Slicing

t = (1, 2, 3, 4, 5)
print(t[1:4])

💡 Click to View Answer

Slice rule: left-inclusive, right-exclusive

Answer: (2, 3, 4)

Keywords: t[1:4] takes indexes 1, 2, 3 (4 is excluded); the result is still a tuple

Problem 1.5e: Tuple Repetition

t = (1, 2)
print(t * 3)

💡 Click to View Answer

Answer: (1, 2, 1, 2, 1, 2)

Keywords: * repeats the sequence, just as it does with strings

Problem 1.5f: Tuple is Immutable (TRICKY!)

t = (1, 2, 3)
t[0] = 100
print(t)

💡 Click to View Answer

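Answer: TypeError (the program stops with an error, so print(t) never runs)

Keywords: A tuple is immutable; its elements cannot be changed after creation

Compare: A list is mutable, so its elements can be changed

lst = [1, 2, 3]
lst[0] = 100  # ✅ works fine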
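
To see the error without crashing the program, the assignment can be wrapped in try/except. The sketch below also shows the usual workaround of building a new tuple; the variable names are illustrative.

```python
t = (1, 2, 3)

try:
    t[0] = 100              # tuples do not support item assignment
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# To "change" a tuple, build a new one from slices instead
t_new = (100,) + t[1:]
print(t_new)   # (100, 2, 3)
print(t)       # (1, 2, 3)  (the original is untouched)
```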

Problem 1.5g: For Loop with Tuple

t = (2, 4, 6)
for x in t:
    print(x)

💡 Click to View Answer

Answer:

2
4
6

Keywords: Tuples support for-loop iteration, exactly like lists

Problem 1.5h: List of Tuples (nested structure!)

data = [("Ann", 80), ("Bob", 60)]
print(data[1])

💡 Click to View Answer

Key: The outer container is a list; each element is a tuple

Answer: ('Bob', 60)

Keywords: data[1] takes the list element at index 1 (the whole tuple)

Problem 1.5i: Nested Indexing (double indexing)

data = [("Ann", 80), ("Bob", 60)]
print(data[1][0])

💡 Click to View Answer

Step-by-step:

data[1] = ("Bob", 60)
("Bob", 60)[0] = "Bob"

Answer: Bob

Keywords: Double indexing = nesting; take the outer element first, then the inner one

Problem 1.5j: Tuple Unpacking with For Loop

data = [("Ann", 80), ("Bob", 60)]
for name, score in data:
    print(name)

💡 Click to View Answer

Key: Each tuple is unpacked automatically; name and score receive its two elements

Answer:

Ann
Bob

Keywords: Tuple unpacking; the number of variables must match the number of tuple elements

Problem 1.5k: Mixed Tuple and List

data = [(1, 2), (3, 4), (5, 6)]
print(data[2][1])

💡 Click to View Answer

Step-by-step:

data[2] = (5, 6) (the 3rd tuple)
(5, 6)[1] = 6 (the tuple's 2nd element)

Answer: 6

Problem 1.5l: in Operator with Tuple

t = ("X", "Y", "Z")
if "Y" in t:
    print("Y")
else:
    print("N")

💡 Click to View Answer

Answer: Y

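Keywords: in checks whether an element exists; both tuples and lists support it

Key Pattern 4: Function Basics (MUST KNOW!)

Keywords: def defines a function, return hands back a result and ends the function, print handles output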

Problem 1.6a: Basic Function

def f(x):
    return x * 2
 
print(f(3))

💡 Click to View Answer

Step-by-step:

Call f(3), so x=3
return 3*2 = 6
print(6)

Answer: 6

Keywords: The argument is passed in; return hands back the computed result

Problem 1.6b: Function Without Print (TRICKY!)

def f(x):
    return x * 2
 
f(3)

💡 Click to View Answer

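Answer: (no output)

Keywords: return only hands back a value; it does not display anything! Without print, nothing is shown. (In the interactive shell the value 6 would be echoed, but a script prints nothing.)

Key differences:

return = hands back a result and ends the function (displays nothing)
print = writes to the screen (displays)
Calling a function ≠ automatic output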
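
The difference can be demonstrated by capturing what a script actually writes to the screen. This is a small sketch using `contextlib.redirect_stdout` from the standard library; the variable names are illustrative.

```python
import io
from contextlib import redirect_stdout

def f(x):
    return x * 2

buffer = io.StringIO()
with redirect_stdout(buffer):
    f(3)                     # the return value is discarded; nothing is printed
silent_output = buffer.getvalue()

buffer = io.StringIO()
with redirect_stdout(buffer):
    print(f(3))              # print() displays the returned value
printed_output = buffer.getvalue()

print(repr(silent_output))   # ''
print(repr(printed_output))  # '6\n'
```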

Problem 1.6c: Multiple Parameters

def add(a, b):
    return a + b
 
print(add(2, 5))

💡 Click to View Answer

Answer: 7

Keywords: Multiple parameters are separated by commas; 2+5=7

Problem 1.6d: Function with Arithmetic

def f(x):
    return x + 1
 
print(f(2) + f(3) * 2)

💡 Click to View Answer

Step-by-step:

f(2) = 2+1 = 3
f(3) = 3+1 = 4
3 + 4*2 = 3 + 8 = 11 (multiplication first!)

Answer: 11

Keywords: Function return values take part in expressions and follow arithmetic precedence

Problem 1.6e: Boolean Function

def is_even(n):
    return n % 2 == 0
 
print(is_even(5))

💡 Click to View Answer

Step-by-step:

5 % 2 = 1 (remainder)
1 == 0? False

Answer: False

Keywords: A Boolean function returns True/False; % gives the remainder

Problem 1.6f: Function with If (common mixed question type)

def check(n):
    if n > 10:
        print("Big")
    else:
        print("Small")
    return None
 
result = check(12)
print(result)

💡 Click to View Answer

Step-by-step:

check(12): 12>10 is true, print("Big")
return None
print(result) → print(None)

Answer:

Big
None

Keywords: The print inside the function runs, and the returned None is then printed by print(result)

Problem 1.6g: Function with For Loop

def sum_list(a):
    s = 0
    for x in a:
        s += x
    return s
 
print(sum_list([1, 2, 3]))

💡 Click to View Answer

Trace:

s=0, x=1: s=0+1=1
x=2: s=1+2=3
x=3: s=3+3=6

Answer: 6

Keywords: A function parameter can be a list; loop over it and accumulate

Problem 1.6h: Function Returning String

def grade(m):
    if m >= 50:
        return "pass"
    else:
        return "fail"
 
print(grade(45))

💡 Click to View Answer

Step-by-step:

grade(45): 45>=50? False
return "fail"

Answer: fail

Keywords: return can hand back any type, including strings

Problem 1.6i: Nested Function Call (possibly beyond the syllabus)

def f(x):
    return x + 1
 
def g(x):
    return f(x) * 2
 
print(g(3))

💡 Click to View Answer

Step-by-step:

g(3) calls f(3)
f(3) = 3+1 = 4
g(3) = 4 * 2 = 8

Answer: 8

Keywords: Nested function calls; the inner function runs first

Key Pattern 5: String Slicing (Left-Inclusive, Right-Exclusive)

Problem 1.6: String Slice

s = "ABW505"
print(s[1:5])

💡 Click to View Answer

Index map:

Character:  A   B   W   5   0   5
Index:      0   1   2   3   4   5

s[1:5] → indices 1, 2, 3, 4 (NOT including 5)

Answer: BW50

Problem 1.7: Negative Indexing

text = "Hello World"
print(text[-5:-1])

💡 Click to View Answer

Index map:

Character: H   e   l   l   o       W   o   r   l   d
Positive:  0   1   2   3   4   5   6   7   8   9   10
Negative:-11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1

text[-5:-1] → from 'W' (index -5) to 'l' (index -2, NOT including -1)

Answer: Worl

Key Pattern 6: List Operations

Problem 1.8: Slice Assignment (TRICKY!)

numbers = [10, 20, 30, 40, 50]
numbers[1:4] = [100]
print(len(numbers))
print(numbers[2])

💡 Click to View Answer

Step-by-step:

Original: [10, 20, 30, 40, 50]
numbers[1:4] selects [20, 30, 40] (3 elements)
Replace with [100] (1 element)
Result: [10, 100, 50]
Length: 3
numbers[2] = 50

Answers:

len(numbers) → 3
numbers[2] → 50

Key concept: Slice assignment can change list size! Replacing 3 elements with 1 element reduces length by 2.
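
Slice assignment can shrink, grow, or delete part of a list, depending on how many elements are assigned. The short sketch below shows all three cases; the list names are illustrative.

```python
numbers = [10, 20, 30, 40, 50]
numbers[1:4] = [100]            # 3 elements replaced by 1: list shrinks
print(numbers)                  # [10, 100, 50]

grown = [10, 20, 30]
grown[1:2] = [7, 8, 9]          # 1 element replaced by 3: list grows
print(grown)                    # [10, 7, 8, 9, 30]

deleted = [10, 20, 30, 40]
deleted[1:3] = []               # assigning an empty list deletes the slice
print(deleted)                  # [10, 40]
```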

Problem 1.9: List Reference vs Copy

a = [1, 2, 3]
b = a
b.append(4)
print(a)
print(a is b)

💡 Click to View Answer

Key concept: b = a creates a REFERENCE, not a copy!

a and b point to the SAME list object
Modifying b also modifies a
a is b → True (same object in memory)

Answers:

print(a) → [1, 2, 3, 4]
print(a is b) → True

To create an independent copy: Use b = a.copy() or b = a[:]
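
One caveat worth knowing: `a.copy()` and `a[:]` make shallow copies, so nested lists inside are still shared. The sketch below contrasts an alias, a shallow copy, and `copy.deepcopy`; the variable names are illustrative.

```python
import copy

a = [1, 2, 3]
alias = a              # same object as a
shallow = a.copy()     # new outer list
alias.append(4)
print(a)                           # [1, 2, 3, 4]
print(shallow)                     # [1, 2, 3]
print(a is alias, a is shallow)    # True False

nested = [[1, 2], [3, 4]]
shallow_nested = nested.copy()        # inner lists are still shared
deep_nested = copy.deepcopy(nested)   # inner lists are copied too
nested[0].append(99)
print(shallow_nested[0])   # [1, 2, 99]
print(deep_nested[0])      # [1, 2]
```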

Key Pattern 7: Functions with Default Arguments

Problem 1.10: Keyword Arguments

def mystery(a, b=5, c=10):
    return a * 2 + b - c
 
result = mystery(3, c=4)
print(result)

💡 Click to View Answer

Step-by-step:

a = 3 (positional argument)
b = 5 (uses default, NOT overridden)
c = 4 (keyword argument overrides default)
Calculation: 3 * 2 + 5 - 4 = 6 + 5 - 4 = 7

Answer: 7

Key concept: Keyword arguments let you skip over default parameters.
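
Calling the same function in several ways shows how positional and keyword arguments fill the parameters. This is a small sketch reusing `mystery` from the problem; the extra calls are added for illustration.

```python
def mystery(a, b=5, c=10):
    return a * 2 + b - c

only_a = mystery(3)                  # b=5, c=10: 6 + 5 - 10
two_positional = mystery(3, 4)       # second positional fills b: 6 + 4 - 10
skip_b = mystery(3, c=4)             # b keeps its default: 6 + 5 - 4
all_keywords = mystery(c=1, b=3, a=2)  # keyword order does not matter: 4 + 3 - 1

print(only_a)          # 1
print(two_positional)  # 0
print(skip_b)          # 7
print(all_keywords)    # 6
```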

Key Pattern 8: Loops and Range

Problem 1.11: Range with Accumulator

total = 0
for i in range(1, 4):
    total += i
print(total)

💡 Click to View Answer

range(1, 4) generates: 1, 2, 3 (NOT including 4)

Accumulation: 0 + 1 + 2 + 3 = 6

Answer: 6

Problem 1.12: Break Statement

for i in range(5):
    if i == 2:
        break
    print(i)

💡 Click to View Answer

i=0: Print 0
i=1: Print 1
i=2: Break! Exit loop immediately

Answer:

0
1

Key Pattern 9: List Comprehension

Problem 1.13: Filtered List Comprehension

nums = [1, 2, 3, 4, 5]
result = [x**2 for x in nums if x % 2 == 1]
print(result)
print(sum(result))

💡 Click to View Answer

Step-by-step:

Filter odd numbers: 1, 3, 5 (where x % 2 == 1)
Square each: 1², 3², 5² = 1, 9, 25
Result: [1, 9, 25]
Sum: 1 + 9 + 25 = 35

Answers:

result → [1, 9, 25]
sum(result) → 35

Pattern: [expression for item in iterable if condition]
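
A filtered comprehension is shorthand for a loop with an if and an append. The sketch below writes both forms side by side to show they produce the same list; `loop_result` is an illustrative name.

```python
nums = [1, 2, 3, 4, 5]

# Comprehension form
squares_of_odds = [x**2 for x in nums if x % 2 == 1]

# Equivalent explicit loop
loop_result = []
for x in nums:
    if x % 2 == 1:               # the filter condition
        loop_result.append(x**2)  # the expression

print(squares_of_odds)                 # [1, 9, 25]
print(squares_of_odds == loop_result)  # True
```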

📝 Q2: Code Writing (30 points - Choose 3 of 5)

Template 1: Menu with List (MUST MEMORIZE!)

Problem: Write a Python program that displays this menu repeatedly:

Add a number to the list
Display the list
Exit

💡 Click to View Verified Answer

# Initialize empty list to store numbers
data = []
 
# Main program loop - runs until user chooses to exit
while True:
    # Display menu options with clear prompts
    print("\n--- MENU ---")
    print("1. Add a number to the list")
    print("2. Display the list")
    print("3. Exit")
    
    # Get user choice with prompt (IMPORTANT: include prompt text!)
    choice = input("Enter your choice (1/2/3): ")
    
    # Process user choice
    if choice == "1":
        # Option 1: Add number
        # Use try-except to handle invalid input gracefully
        try:
            num = int(input("Enter a number to add: "))
            data.append(num)
            print(f"Added {num} to the list.")
        except ValueError:
            print("Invalid input! Please enter a valid integer.")
    
    elif choice == "2":
        # Option 2: Display list
        if len(data) == 0:
            print("The list is empty.")
        else:
            print(f"Current list: {data}")
    
    elif choice == "3":
        # Option 3: Exit program
        print("Goodbye!")
        break
    
    else:
        # Handle invalid menu choice
        print("Invalid choice! Please enter 1, 2, or 3.")

Key improvements over the original buggy version:

✅ Added prompt text to input() - users know what to enter
✅ Added try-except for error handling - won't crash on invalid input
✅ Used string comparison instead of int - avoids crash if user enters text
✅ Added feedback messages - users know what happened
✅ Added empty list check - better user experience

ORIGINAL BUGGY VERSION (what was wrong):

# PROBLEMATIC CODE - DO NOT USE IN EXAM
data = []
while True:
    print("1.Add")
    print("2.Show")
    print("3.Exit")
    c = int(input())  # BUG: Crashes if user enters non-integer!
    
    if c == 1:
        data.append(int(input()))  # BUG: Crashes on invalid input, no prompt!
    elif c == 2:
        print(data)
    elif c == 3:
        break
# Missing: else clause, error handling, user prompts

Why it crashes: int(input()) without try-except will throw ValueError if user enters anything that's not a number (like pressing Enter, or typing "abc").

✍️ Condensed Handwriting Version

Keeps only the core logic, with all comments and error handling removed:

data = []
while True:
    print("1.Add 2.Show 3.Exit")
    c = input("Choice: ")
    if c == "1":
        data.append(int(input("Num: ")))
    elif c == "2":
        print(data)
    elif c == "3":
        break

Handwriting tips: about 10 lines; must have while True plus break to exit

Template 2: List with Sentinel Value (-1)

Problem: Write a Python program that:

Allows user to enter integers
Stops when user enters -1
Prints the minimum, maximum, and average

💡 Click to View Verified Answer

# Initialize empty list to store user's numbers
nums = []
 
print("Enter integers. Enter -1 to stop.")
 
# Main input loop
while True:
    try:
        # Get integer input with clear prompt
        n = int(input("Enter a number (-1 to stop): "))
        
        # Check for sentinel value
        if n == -1:
            break  # Exit loop when user enters -1
        
        # Add valid number to list
        nums.append(n)
        
    except ValueError:
        # Handle non-integer input
        print("Invalid input! Please enter an integer.")
 
# Calculate and display statistics
# IMPORTANT: Check if list is empty to avoid division by zero!
if len(nums) == 0:
    print("No numbers were entered.")
else:
    minimum = min(nums)
    maximum = max(nums)
    average = sum(nums) / len(nums)
    
    print(f"\nResults:")
    print(f"Minimum: {minimum}")
    print(f"Maximum: {maximum}")
    print(f"Average: {average:.2f}")  # .2f for 2 decimal places

Sample run:

Enter integers. Enter -1 to stop.
Enter a number (-1 to stop): 5
Enter a number (-1 to stop): 10
Enter a number (-1 to stop): 3
Enter a number (-1 to stop): -1

Results:
Minimum: 3
Maximum: 10
Average: 6.00

Edge case handling: Always check if list is empty before calculating statistics! min([]) and max([]) will raise ValueError, and sum([])/len([]) will raise ZeroDivisionError.
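
The two failure modes can be triggered deliberately to confirm the exception types. The sketch below also wraps the statistics in a hypothetical helper, `summarize`, that returns None for an empty list instead of crashing.

```python
def summarize(nums):
    """Return (min, max, average), or None for an empty list."""
    if not nums:
        return None
    return min(nums), max(nums), sum(nums) / len(nums)

errors = []
try:
    min([])                      # min() of an empty sequence
except ValueError as err:
    errors.append(type(err).__name__)
try:
    sum([]) / len([])            # 0 / 0
except ZeroDivisionError as err:
    errors.append(type(err).__name__)

print(errors)                 # ['ValueError', 'ZeroDivisionError']
print(summarize([5, 10, 3]))  # (3, 10, 6.0)
print(summarize([]))          # None
```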

Template 3: Dictionary Operations

Problem: Write a Python program that:

Stores student names and marks in a dictionary
Allows multiple entries
Prints the average mark

💡 Click to View Verified Answer

# Initialize empty dictionary: {name: mark}
students = {}
 
print("Student Grade Recorder")
print("Enter student names and marks. Type 'stop' as name to finish.")
 
# Main input loop
while True:
    # Get student name with prompt
    name = input("\nEnter student name (or 'stop' to finish): ")
    
    # Check for stop condition (case-insensitive)
    if name.lower() == "stop":
        break
    
    # Check for empty name
    if name.strip() == "":
        print("Name cannot be empty!")
        continue
    
    # Get mark with error handling
    try:
        mark = int(input(f"Enter mark for {name}: "))
        
        # Optional: Validate mark range
        if mark < 0 or mark > 100:
            print("Warning: Mark is outside 0-100 range.")
        
        # Store in dictionary
        students[name] = mark
        print(f"Recorded: {name} = {mark}")
        
    except ValueError:
        print("Invalid mark! Please enter a number.")
 
# Calculate and display average
if len(students) == 0:
    print("\nNo students were recorded.")
else:
    # Get all marks using .values()
    all_marks = students.values()
    average = sum(all_marks) / len(all_marks)
    
    print(f"\n--- Student Records ---")
    for name, mark in students.items():
        print(f"{name}: {mark}")
    print(f"\nAverage mark: {average:.2f}")

Key dictionary operations:

dict.values() - get all values (marks)
dict.items() - get all key-value pairs
dict.keys() - get all keys (names)

✍️ Condensed Handwriting Version

students = {}
while True:
    name = input("Name (stop to end): ")
    if name == "stop":
        break
    mark = int(input("Mark: "))
    students[name] = mark
# Calculate average
avg = sum(students.values()) / len(students)
print("Average:", avg)

Handwriting tips: about 10 lines; store data in a dict; .values() gets all the marks

Template 4: Grade Calculator (Decision Structure)

Problem: Write a function grade_calculator(score) that:

Returns letter grade: 90+ → "A", 80+ → "B", 70+ → "C", 60+ → "D", <60 → "F"
Returns "Invalid" for negative or > 100

💡 Click to View Verified Answer

def grade_calculator(score):
    """
    Convert numeric score to letter grade.
    
    Args:
        score: Numeric score (expected 0-100)
    
    Returns:
        str: Letter grade (A/B/C/D/F) or "Invalid"
    """
    # FIRST: Check for invalid input
    # Must check this BEFORE checking grade ranges
    if score < 0 or score > 100:
        return "Invalid"
    
    # Check grades from highest to lowest
    # Using elif ensures only ONE condition is matched
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"
 
 
# Test the function
if __name__ == "__main__":
    test_scores = [95, 85, 73, 65, 45, -5, 105]
    for s in test_scores:
        print(f"Score {s} → Grade {grade_calculator(s)}")

Output:

Score 95 → Grade A
Score 85 → Grade B
Score 73 → Grade C
Score 65 → Grade D
Score 45 → Grade F
Score -5 → Grade Invalid
Score 105 → Grade Invalid

Common mistakes:

Not checking invalid input FIRST
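Using separate if statements that assign a grade instead of elif (later checks overwrite the result; separate ifs are only safe when each one returns immediately)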
Checking in wrong order (60+ before 90+)
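
The second mistake is easiest to understand by running a buggy version. In this sketch, `grade_buggy` is a hypothetical example of the error (independent ifs that each assign a value), and `grade_fixed` uses elif so only the first matching branch runs.

```python
def grade_buggy(score):
    # Hypothetical buggy version: every true condition overwrites the result
    grade = "F"
    if score >= 90:
        grade = "A"
    if score >= 80:
        grade = "B"
    if score >= 70:
        grade = "C"
    if score >= 60:
        grade = "D"
    return grade

def grade_fixed(score):
    # elif stops at the first condition that is true
    if score >= 90:
        grade = "A"
    elif score >= 80:
        grade = "B"
    elif score >= 70:
        grade = "C"
    elif score >= 60:
        grade = "D"
    else:
        grade = "F"
    return grade

print(grade_buggy(95))  # D  (wrong: the last true check wins)
print(grade_fixed(95))  # A
```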

✍️ Condensed Handwriting Version

def grade(score):
    if score < 0 or score > 100:
        return "Invalid"
    if score >= 90: return "A"
    if score >= 80: return "B"
    if score >= 70: return "C"
    if score >= 60: return "D"
    return "F"

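Handwriting tips: about 8 lines; check for invalid input first, then test from highest to lowest (separate ifs work here because each one returns immediately)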

Template 5: Boolean Function (PREDICTED TOPIC!)

Problem: Write a function that returns True/False based on conditions (function + and/or/not)

💡 Click to View Examples

Example 1: Check if number is in range [10, 50]

def in_range(n):
    return n >= 10 and n <= 50
 
print(in_range(25))  # True
print(in_range(5))   # False

Example 2: Check if all three numbers are positive

def all_positive(a, b, c):
    return a > 0 and b > 0 and c > 0
 
print(all_positive(1, 2, 3))   # True
print(all_positive(1, -2, 3))  # False

Example 3: Check if at least one is even

def has_even(a, b, c):
    return a % 2 == 0 or b % 2 == 0 or c % 2 == 0
 
print(has_even(1, 3, 5))  # False
print(has_even(1, 2, 5))  # True

Example 4: Check if string is valid password

def is_valid_password(pwd):
    # At least 8 characters and contains digit
    has_length = len(pwd) >= 8
    has_digit = any(c.isdigit() for c in pwd)
    return has_length and has_digit
 
print(is_valid_password("abc12345"))  # True
print(is_valid_password("short1"))    # False

Boolean operators:

and = both conditions must be true
or = at least one condition must be true
not = negation

Keywords: return True/False, combining conditions

✍️ Condensed Handwriting Version

def is_valid(x, y, z):
    return x > 0 and y > 0 and z > 0

Handwriting tips: a single return line is enough; combine conditions with and/or

Template 6: Prime Number Check

Problem: Write a function is_prime(num) that returns True if prime, False otherwise.

💡 Click to View Verified Answer

def is_prime(num):
    """
    Check if a number is prime.
    
    A prime number is:
    - Greater than 1
    - Only divisible by 1 and itself
    
    Args:
        num: Integer to check
    
    Returns:
        bool: True if prime, False otherwise
    """
    # Numbers less than 2 are not prime
    # (0, 1, and negative numbers)
    if num < 2:
        return False
    
    # 2 is the only even prime
    if num == 2:
        return True
    
    # All other even numbers are not prime
    if num % 2 == 0:
        return False
    
    # Check odd divisors up to square root of num
    # Why sqrt? If n = a × b, one of a,b must be ≤ √n
    # We use int(num ** 0.5) + 1 to include the square root
    for i in range(3, int(num ** 0.5) + 1, 2):  # Step by 2 (odd numbers only)
        if num % i == 0:
            return False  # Found a divisor, not prime
    
    return True  # No divisors found, it's prime
 
 
# Test the function
if __name__ == "__main__":
    test_nums = [1, 2, 3, 7, 10, 11, 25, 29]
    for n in test_nums:
        result = "Prime" if is_prime(n) else "Not Prime"
        print(f"{n}: {result}")

Output:

1: Not Prime
2: Prime
3: Prime
7: Prime
10: Not Prime
11: Prime
25: Not Prime
29: Prime

Optimization: Only checking up to √n reduces time complexity from O(n) to O(√n).
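
One way to gain confidence in the √n shortcut is to compare it against a brute-force check over a range of inputs. This sketch uses `math.isqrt` (Python 3.8+) for an exact integer square root; both function names are illustrative and simpler than the template above.

```python
import math

def is_prime_naive(num):
    """Try every divisor from 2 to num - 1: O(n)."""
    if num < 2:
        return False
    return all(num % d != 0 for d in range(2, num))

def is_prime_sqrt(num):
    """Try divisors only up to the integer square root: O(sqrt(n))."""
    if num < 2:
        return False
    for d in range(2, math.isqrt(num) + 1):
        if num % d == 0:
            return False
    return True

# The two versions should agree on every input in the tested range
mismatches = [n for n in range(-5, 500) if is_prime_naive(n) != is_prime_sqrt(n)]
primes_under_30 = [n for n in range(30) if is_prime_sqrt(n)]

print(mismatches)       # []
print(primes_under_30)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```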

Template 7: Fibonacci Sequence

Problem: Write a function fibonacci(n) that returns the first n Fibonacci numbers as a list.

💡 Click to View Verified Answer

def fibonacci(n):
    """
    Generate the first n Fibonacci numbers.
    
    Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, ...
    Each number is the sum of the two preceding numbers.
    
    Args:
        n: Number of Fibonacci numbers to generate
    
    Returns:
        list: First n Fibonacci numbers
    """
    # Handle edge cases
    if n <= 0:
        return []  # Empty list for invalid input
    if n == 1:
        return [0]  # Only the first number
    
    # Start with first two Fibonacci numbers
    result = [0, 1]
    
    # Generate remaining numbers
    for i in range(2, n):
        # Each new number = sum of last two
        next_num = result[-1] + result[-2]  # Use negative indexing
        result.append(next_num)
    
    return result
 
 
# Test the function
if __name__ == "__main__":
    for count in [0, 1, 5, 10]:
        print(f"fibonacci({count}) = {fibonacci(count)}")

Output:

fibonacci(0) = []
fibonacci(1) = [0]
fibonacci(5) = [0, 1, 1, 2, 3]
fibonacci(10) = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

Template 8: Remove Duplicates (Preserve Order)

Problem: Write a function that removes duplicates from a list while preserving the order of first occurrence.

💡 Click to View Verified Answer

def remove_duplicates(lst):
    """
    Remove duplicate elements while preserving first occurrence order.
    
    Example: [1, 2, 2, 3, 1, 4] → [1, 2, 3, 4]
    
    Args:
        lst: Input list with possible duplicates
    
    Returns:
        list: New list with duplicates removed
    """
    seen = []  # Track items we've already seen
    
    for item in lst:
        if item not in seen:  # Only add if not seen before
            seen.append(item)
    
    return seen
 
 
# Alternative using dict (Python 3.7+ preserves insertion order)
def remove_duplicates_v2(lst):
    """
    Remove duplicates using dictionary (more efficient for large lists).
    dict.fromkeys() preserves first occurrence order.
    """
    return list(dict.fromkeys(lst))
 
 
# Test both versions
if __name__ == "__main__":
    test = [1, 2, 2, 3, 1, 4, 5, 3, 2]
    print(f"Original: {test}")
    print(f"Method 1: {remove_duplicates(test)}")
    print(f"Method 2: {remove_duplicates_v2(test)}")

Output:

Original: [1, 2, 2, 3, 1, 4, 5, 3, 2]
Method 1: [1, 2, 3, 4, 5]
Method 2: [1, 2, 3, 4, 5]

Why not use set()? Sets don't preserve order! list(set([1, 2, 2, 3, 1, 4])) might give [1, 2, 3, 4] but order is not guaranteed.

Template 8: Exception Handling

Problem: Write a program that repeatedly asks for a nonzero integer and calculates its reciprocal, handling invalid inputs.

💡 Click to View Verified Answer

def get_reciprocal():
    """
    Get a nonzero integer from user and calculate its reciprocal.
    Handles ValueError (non-integer) and ZeroDivisionError (zero input).
    """
    while True:
        try:
            # Get input from user
            n = int(input("Enter a nonzero integer: "))
            
            # Calculate reciprocal (will raise ZeroDivisionError if n=0)
            reciprocal = 1 / n
            
            # If we get here, input was valid
            print(f"The reciprocal of {n} is {reciprocal:.3f}")
            break  # Exit loop on success
            
        except ValueError:
            # int() failed - input was not a valid integer
            print("Error: You did not enter a valid integer. Try again.")
            
        except ZeroDivisionError:
            # Division by zero
            print("Error: You entered zero. Cannot divide by zero. Try again.")
 
 
# Run the function
if __name__ == "__main__":
    get_reciprocal()

Sample run:

Enter a nonzero integer: abc
Error: You did not enter a valid integer. Try again.
Enter a nonzero integer: 0
Error: You entered zero. Cannot divide by zero. Try again.
Enter a nonzero integer: 4
The reciprocal of 4 is 0.250
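For practice without typing at the keyboard, the same try/except logic can be sketched as a function that takes the text as an argument. The name `reciprocal_or_error` and the error strings are illustrative assumptions, not part of the template above:

```python
def reciprocal_or_error(text):
    """Return (reciprocal, None) on success or (None, message) on failure."""
    try:
        n = int(text)          # raises ValueError for non-integer text
        return 1 / n, None     # raises ZeroDivisionError when n == 0
    except ValueError:
        return None, "not a valid integer"
    except ZeroDivisionError:
        return None, "cannot divide by zero"


# Same three inputs as the sample run
for raw in ["abc", "0", "4"]:
    print(raw, "->", reciprocal_or_error(raw))
```

Each `except` branch catches exactly one failure mode, which is the point the exam question is testing.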

📝 Q3: Machine Learning - Theory & Code (25 points)

Flowchart Symbols (MUST KNOW!)

Keywords: Flowcharts are used to design and explain program logic

💡 Click to View All 5 Symbols

Symbol	Shape	Name	Purpose
⬭	Oval	Terminal	Start/End of the flowchart
▱	Parallelogram	I/O	Input/Output operations (e.g., enter values, display results)
▭	Rectangle	Process	Processing/Calculation (e.g., x = a + b)
◇	Diamond	Decision	Condition check (Yes/No branches)
→	Arrow	Flow Line	Direction of flow in program logic

Example question: Draw a flowchart to find the largest among three numbers (a, b, c).

Flowchart structure:

[Start] → [Input a, b, c] → <a > b?> 
                               ↓Yes        ↓No
                           <a > c?>    <b > c?>
                           ↓Yes  ↓No   ↓Yes  ↓No
                         [max=a][max=c][max=b][max=c]
                               ↓ ↓ ↓ ↓
                         [Output max] → [End]

Key points for exam:

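Start/End: Every flowchart must have a start symbol and an end symbol
Input: Read the inputs before processing them
Decision: Use a diamond for each condition, with two branches (Yes/No)
Process: Write calculations inside rectangles
Arrows: Connect all symbols with arrows to show the direction of flow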
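The flowchart for the largest of three numbers maps directly onto nested if/else statements. This is one possible translation, and the function name `largest_of_three` is an assumption for illustration:

```python
def largest_of_three(a, b, c):
    """Follow the flowchart: one diamond per comparison."""
    if a > b:              # first diamond: a > b?
        if a > c:          # Yes branch: a > c?
            largest = a
        else:
            largest = c
    else:
        if b > c:          # No branch: b > c?
            largest = b
        else:
            largest = c
    return largest


print(largest_of_three(4, 9, 2))  # 9
```

Every path through the function ends in exactly one assignment, matching the four process boxes in the flowchart.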

Algorithm Comparison Table (MUST KNOW!)

Keywords: Choose the model that fits the characteristics of the data

Algorithm	Best For	Pros	Cons	When to Use?
Naive Bayes	Small, low-dimensional	Fast, simple, low variance	Independence assumption unrealistic	Compare probabilities, pick highest
Logistic Regression	Small-medium data	Interpretable, stable	Linear separation only	Linear relationship, probability output
Decision Tree	Small-medium data	Easy to understand, visual	Prone to overfitting	Need clear rules, explainable
KNN	Small data	Simple, no training	Slow, sensitive to noise	Small data, low dimensions, classification
SVM	Small, high-dimensional	High accuracy, good generalization	Complex, hard to tune	High-dimensional data, margin-based
Random Forest	Medium data	Accurate, resists overfitting	Less interpretable	When a single decision tree overfits; a bagged ensemble of trees

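💡 Model Selection Quick Rules (exam quick reference)

Q: Small data, low dimensions, classification problem: which model? A: Naive Bayes - performs well, low variance, needs little data, designed for classification

Q: How is Random Forest better than a Decision Tree? A: It reduces overfitting - it combines many decision trees (bagging) to improve accuracy

Q: What happens as K increases (in KNN)? A: Larger K → lower variance (more stable) + higher bias

Q: What data suits SVM? A: Strong on high-dimensional data, slow on large data - works well in high-dimensional spaces, but computationally expensive for large datasets

Q: Why use Encoder before ML model? A: ML models need numerical input - encoders transform categorical data into numbers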

Pandas Basics

💡 Click to View Common Operations

import pandas as pd
 
# ========== Reading Data ==========
# Read CSV file
df = pd.read_csv("data.csv")
 
# Display first/last rows
print(df.head())   # First 5 rows
print(df.tail())   # Last 5 rows
print(df.shape)    # (rows, columns)
 
# ========== Handling Missing Values ==========
# Check for missing values
print(df.isnull().sum())  # Count of nulls per column
 
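# Fill missing values (assign the result back; fillna(..., inplace=True) on a
# selected column is chained assignment and may not update df in recent pandas)
df['Age'] = df['Age'].fillna(df['Age'].mean())        # Option A: fill with mean
# df['Age'] = df['Age'].fillna(df['Age'].median())    # Option B: fill with median instead
df['Salary'] = df['Salary'].fillna(50000)             # Fill with specific value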
 
# Drop rows with missing values
df.dropna(inplace=True)
 
# ========== Selecting Data ==========
# Select single column
ages = df['Age']
 
# Select multiple columns
subset = df[['Name', 'Age']]
 
# Filter rows
adults = df[df['Age'] >= 18]
 
# Multiple conditions (use & for AND, | for OR)
result = df[(df['Age'] >= 18) & (df['Department'] == 'IT')]
 
# ========== Grouping ==========
# Group by and aggregate
avg_salary = df.groupby('Department')['Salary'].mean()

LabelEncoder for Categorical Data

💡 Click to View Example

from sklearn.preprocessing import LabelEncoder
import pandas as pd
 
# Sample data with categorical columns
data = {
    'Color': ['Red', 'Blue', 'Green', 'Red', 'Blue'],
    'Size': ['Small', 'Medium', 'Large', 'Medium', 'Small']
}
df = pd.DataFrame(data)
 
# Create LabelEncoder instance
le = LabelEncoder()
 
# Encode each categorical column
# LabelEncoder sorts values alphabetically then assigns 0, 1, 2...
for col in df.columns:
    if df[col].dtype == 'object':  # Check if column is string/object type
        df[col] = le.fit_transform(df[col])
 
print(df)
# Color encoding: Blue=0, Green=1, Red=2 (alphabetical)
# Size encoding: Large=0, Medium=1, Small=2 (alphabetical)

LabelEncoder mapping (classes are sorted, so strings come out in alphabetical order):

Original	Encoded
Blue	0
Green	1
Red	2
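To see where those codes come from, the mapping can be reproduced by hand in plain Python. This is only a sketch of the idea (sort the unique labels, then number them), not scikit-learn's actual implementation:

```python
colors = ['Red', 'Blue', 'Green', 'Red', 'Blue']

# Sort the unique labels, then number them 0, 1, 2, ...
classes = sorted(set(colors))
mapping = {label: code for code, label in enumerate(classes)}
print(mapping)  # {'Blue': 0, 'Green': 1, 'Red': 2}

# Apply the mapping to every value in the column
encoded = [mapping[c] for c in colors]
print(encoded)  # [2, 0, 1, 2, 0]
```

The integer codes carry no real order, which is why label encoding can mislead models that treat the numbers as magnitudes.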

Train-Test Split

💡 Click to View Example

from sklearn.model_selection import train_test_split
 
# Assume X = features, y = target variable
X = df.drop('target', axis=1)  # All columns except target
y = df['target']               # Target column only
 
# Split data: 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, 
    test_size=0.2,      # 20% for testing
    random_state=42     # For reproducibility
)
 
print(f"Training set size: {len(X_train)}")
print(f"Testing set size: {len(X_test)}")

Why train-test split?

Evaluate model on UNSEEN data
Detect overfitting (memorizing training data)
Simulate real-world usage
Get honest performance estimate
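The idea behind the split can be sketched without scikit-learn: shuffle the rows, then hold some back. The helper `simple_split` is hypothetical and will not reproduce the exact rows that `train_test_split` picks:

```python
import random


def simple_split(rows, test_size=0.2, seed=42):
    """Shuffle a copy of rows and hold out a fraction for testing."""
    rng = random.Random(seed)      # fixed seed makes the split repeatable
    shuffled = rows[:]             # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(10))             # hypothetical dataset of 10 rows
train_rows, test_rows = simple_split(data)
print(len(train_rows), len(test_rows))  # 8 2
```

No row appears in both parts, which is what makes the test score an estimate of performance on unseen data.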

fit() vs predict()

💡 Click to View Explanation

Method	Purpose	When Used
`fit()`	Train the model	Once, on training data only
`predict()`	Apply the model	On test/new data
`fit_transform()`	Fit and transform in one step	For preprocessing (scaler, encoder)

Important:

Use fit_transform() on training data
Use transform() only on test data (NOT fit_transform!)

# CORRECT workflow for scaling
from sklearn.preprocessing import StandardScaler
 
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # Fit AND transform
X_test_scaled = scaler.transform(X_test)         # Transform only (no fit!)
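What "fit on train, transform on test" means numerically can be shown with the standard library alone. The numbers are made up, and the population standard deviation is used because that is what `StandardScaler` computes:

```python
import statistics

train = [10.0, 20.0, 30.0]   # hypothetical training feature
test = [40.0]                # hypothetical test feature

# "fit": learn the statistics from the TRAINING data only
mean = statistics.fmean(train)
std = statistics.pstdev(train)   # population standard deviation

# "transform": apply the same training statistics to both sets
train_scaled = [(x - mean) / std for x in train]
test_scaled = [(x - mean) / std for x in test]

print(train_scaled)
print(test_scaled)
```

Refitting on the test set would use a different mean and standard deviation, so the two sets would no longer be on the same scale.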

SVM Implementation

💡 Click to View Complete Code

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report
 
# Step 1: Load data
df = pd.read_csv("data.csv")
 
# Step 2: Encode categorical features
le = LabelEncoder()
for col in df.select_dtypes(include=['object']).columns:
    if col != 'target':  # Leave the target column unencoded (SVC accepts string class labels)
        df[col] = le.fit_transform(df[col])
 
# Step 3: Separate features (X) and target (y)
X = df.drop('target', axis=1)
y = df['target']
 
# Step 4: Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
 
# Step 5: Scale features (IMPORTANT for SVM!)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # Fit on training data
X_test_scaled = scaler.transform(X_test)         # Transform only on test data
 
# Step 6: Create and train SVM model
svm_model = SVC(kernel='rbf', random_state=42)  # RBF kernel is default
svm_model.fit(X_train_scaled, y_train)
 
# Step 7: Make predictions
y_pred = svm_model.predict(X_test_scaled)
 
# Step 8: Evaluate
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

SVM Key Points:

Scaling matters - SVM is sensitive to feature magnitudes, so always scale features before training
Kernel trick - transforms data to higher dimensions for separation
Common kernels: 'linear', 'rbf' (Gaussian), 'poly' (polynomial)
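As a rough illustration of the 'rbf' kernel, the similarity it computes between two points is exp(-gamma × squared distance). The `gamma=0.5` value and the sample points are assumptions chosen for easy arithmetic:

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


print(rbf_kernel([1, 2], [1, 2]))  # identical points -> 1.0
print(rbf_kernel([0, 0], [1, 1]))  # squared distance 2 -> exp(-1)
```

Because the value depends on raw distances, a feature with a large numeric range dominates the kernel. That is why scaling comes before the SVM step.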

Random Forest Implementation

💡 Click to View Complete Code

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
 
# Step 1: Load and prepare data
df = pd.read_csv("data.csv")
 
# Step 2: Handle categorical (if needed)
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
for col in df.select_dtypes(include=['object']).columns:
    df[col] = le.fit_transform(df[col])
 
# Step 3: Split features and target
X = df.drop('target', axis=1)
y = df['target']
 
# Step 4: Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
 
# Step 5: Create and train Random Forest
# Note: No scaling needed for tree-based models!
rf_model = RandomForestClassifier(
    n_estimators=100,    # Number of trees
    max_depth=None,      # No limit on depth
    random_state=42
)
rf_model.fit(X_train, y_train)
 
# Step 6: Predictions and evaluation
y_train_pred = rf_model.predict(X_train)
y_test_pred = rf_model.predict(X_test)
 
print(f"Training Accuracy: {accuracy_score(y_train, y_train_pred):.2f}")
print(f"Testing Accuracy: {accuracy_score(y_test, y_test_pred):.2f}")
 
# Step 7: Feature importance
importance = pd.DataFrame({
    'feature': X.columns,
    'importance': rf_model.feature_importances_
}).sort_values('importance', ascending=False)
print("\nFeature Importance:")
print(importance)

Random Forest Key Points:

Bagging: Trains each tree on a bootstrap sample (rows drawn at random with replacement), considering a random subset of features at each split
Reduces overfitting: Averaging many trees is more stable
No scaling needed: Tree-based methods don't need scaling
Feature importance: Shows which features matter most
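The two ingredients of bagging, bootstrap sampling and majority voting, can be sketched in a few lines. The helper names and sample values are hypothetical, and no real trees are trained here:

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random WITH replacement."""
    return rng.choices(rows, k=len(rows))


def majority_vote(predictions):
    """Return the class predicted by the most trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
rows = list(range(10))                 # stand-in for 10 training rows
sample = bootstrap_sample(rows, rng)   # some rows repeat, some are left out
print(sample)

votes = ["Spam", "Not Spam", "Spam"]   # pretend outputs of 3 trees
print(majority_vote(votes))            # Spam
```

Each tree sees a slightly different sample, so its errors differ from the others and the vote smooths them out.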

StandardScaler vs MinMaxScaler

💡 Click to View Comparison

Scaler	Formula	Output Range	Best For
StandardScaler	(x - mean) / std	Mean=0, Std=1	SVM, Logistic Regression; less distorted by outliers than MinMaxScaler
MinMaxScaler	(x - min) / (max - min)	[0, 1]	Neural Networks, KNN, bounded features

Quick rule:

SVM, Linear models → StandardScaler
Neural networks, images → MinMaxScaler

from sklearn.preprocessing import StandardScaler, MinMaxScaler
 
# X is assumed to be a numeric feature matrix (in a full workflow, fit on X_train only)
# StandardScaler: Z-score normalization
scaler1 = StandardScaler()
X_standard = scaler1.fit_transform(X)
 
# MinMaxScaler: Scale to [0, 1]
scaler2 = MinMaxScaler()
X_minmax = scaler2.fit_transform(X)
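Both formulas can be checked by hand on a small list. The values below are invented and include one outlier (100.0) to show how each scaler reacts:

```python
import statistics

values = [1.0, 2.0, 3.0, 4.0, 100.0]   # hypothetical feature with one outlier

# Min-max scaling: (x - min) / (max - min)
lo, hi = min(values), max(values)
minmax = [(v - lo) / (hi - lo) for v in values]

# Z-score scaling: (x - mean) / std, using the population std
mean = statistics.fmean(values)
std = statistics.pstdev(values)
zscores = [(v - mean) / std for v in values]

print([round(m, 3) for m in minmax])
print([round(z, 3) for z in zscores])
```

With min-max scaling the outlier takes the value 1 and squeezes the four ordinary values into a narrow band near 0.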

📝 Q4: Naive Bayes & Decision Tree (25 points)

Bayes' Theorem Formula

$$P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$$

Where:

$P(A|B)$ = Posterior probability (what we want)
$P(B|A)$ = Likelihood
$P(A)$ = Prior probability
$P(B)$ = Evidence (normalizing constant)

Naive Bayes Calculation Example

💡 Click to View Worked Example

Dataset: Classify emails as Spam or Not Spam

Email	Contains "Free"	Contains "Winner"	Spam?
1	Yes	Yes	Spam
2	Yes	No	Spam
3	No	Yes	Spam
4	No	No	Not Spam
5	Yes	No	Not Spam
6	No	No	Not Spam

Question: New email has "Free"=Yes, "Winner"=No. Is it Spam?

Step 1: Calculate Priors

P(Spam) = 3/6 = 0.5
P(Not Spam) = 3/6 = 0.5

Step 2: Calculate Likelihoods

For Spam emails (1, 2, 3):

P(Free=Yes | Spam) = 2/3 (emails 1, 2)
P(Winner=No | Spam) = 1/3 (email 2 only)

For Not Spam emails (4, 5, 6):

P(Free=Yes | Not Spam) = 1/3 (email 5)
P(Winner=No | Not Spam) = 3/3 = 1 (all three)

Step 3: Calculate Unnormalized Posteriors

$P(Spam | evidence) \propto P(Spam) \times P(Free=Yes|Spam) \times P(Winner=No|Spam)$ $= 0.5 \times \frac{2}{3} \times \frac{1}{3} = 0.111$

$P(Not Spam | evidence) \propto 0.5 \times \frac{1}{3} \times 1 = 0.167$

Step 4: Normalize

$P(Spam | evidence) = \frac{0.111}{0.111 + 0.167} = \frac{0.111}{0.278} = 0.40 = 40\%$

Prediction: NOT SPAM (P(Spam | evidence) = 40% < P(Not Spam | evidence) = 60%)
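The hand calculation can be double-checked with exact fractions. This sketch hard-codes the six-email table above, and the helper name `unnormalized_posterior` is an assumption; it applies no smoothing, just like the worked example:

```python
from fractions import Fraction

# (Contains "Free", Contains "Winner", label) for emails 1-6
emails = [
    ("Yes", "Yes", "Spam"),
    ("Yes", "No", "Spam"),
    ("No", "Yes", "Spam"),
    ("No", "No", "Not Spam"),
    ("Yes", "No", "Not Spam"),
    ("No", "No", "Not Spam"),
]


def unnormalized_posterior(label, free, winner):
    """Prior times the two likelihoods, counted directly from the table."""
    rows = [e for e in emails if e[2] == label]
    prior = Fraction(len(rows), len(emails))
    p_free = Fraction(sum(e[0] == free for e in rows), len(rows))
    p_winner = Fraction(sum(e[1] == winner for e in rows), len(rows))
    return prior * p_free * p_winner


spam = unnormalized_posterior("Spam", "Yes", "No")          # 1/9
not_spam = unnormalized_posterior("Not Spam", "Yes", "No")  # 1/6
p_spam = spam / (spam + not_spam)                           # normalize
print(spam, not_spam, p_spam)  # 1/9 1/6 2/5
```

Working in fractions avoids rounding: 1/9 ≈ 0.111, 1/6 ≈ 0.167, and the normalized result is exactly 2/5 = 40%.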

Gini Index Formula & Calculation

$$Gini = 1 - \sum_{i=1}^{n} p_i^2$$

Where $p_i$ is the proportion of class $i$ in the node.

💡 Click to View Worked Example

Dataset: 20 emails (12 Spam, 8 Not Spam)

Split by "Contains Free":

Contains "free": 10 emails (9 Spam, 1 Not Spam)
No "free": 10 emails (3 Spam, 7 Not Spam)

Step 1: Gini for "Contains Free" node (9S, 1N)

P(Spam) = 9/10 = 0.9
P(Not Spam) = 1/10 = 0.1
Gini = 1 - (0.9² + 0.1²) = 1 - (0.81 + 0.01) = 0.18

Step 2: Gini for "No Free" node (3S, 7N)

P(Spam) = 3/10 = 0.3
P(Not Spam) = 7/10 = 0.7
Gini = 1 - (0.3² + 0.7²) = 1 - (0.09 + 0.49) = 0.42

Step 3: Weighted Average Gini

$Gini_{split} = \frac{10}{20} \times 0.18 + \frac{10}{20} \times 0.42 = 0.5 \times 0.18 + 0.5 \times 0.42 = 0.09 + 0.21 = 0.30$

Final Answer: Gini for this split = 0.30

Interpretation: Lower Gini = better split. Pure node has Gini = 0.
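A short sketch that reproduces the three numbers above from the class counts. The function names `gini` and `weighted_gini` are illustrative:

```python
def gini(counts):
    """Gini impurity of one node, given its class counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


def weighted_gini(children):
    """Size-weighted average Gini over the child nodes of a split."""
    n = sum(sum(c) for c in children)
    return sum(sum(c) / n * gini(c) for c in children)


left = gini([9, 1])     # "Contains Free": 9 Spam, 1 Not Spam
right = gini([3, 7])    # "No Free": 3 Spam, 7 Not Spam
split = weighted_gini([[9, 1], [3, 7]])
print(f"{left:.2f} {right:.2f} {split:.2f}")  # 0.18 0.42 0.30
```

A pure node such as `gini([10, 0])` gives 0, matching the interpretation above.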

Information Gain (Entropy)

$$Entropy = -\sum_{i=1}^{n} p_i \log_2(p_i)$$

$$Information\ Gain = Entropy(parent) - \sum_{children} \frac{n_{child}}{n_{parent}} \times Entropy(child)$$

💡 Click to View Worked Example

Parent node: 3 Spam, 3 Not Spam (50/50 split)

Parent Entropy (perfect balance = maximum entropy): $H(parent) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5)$ $= -0.5(-1) - 0.5(-1) = 0.5 + 0.5 = 1.0$

Child node "Free=Yes" (2 Spam, 1 Not Spam): $H = -\frac{2}{3} \log_2(\frac{2}{3}) - \frac{1}{3} \log_2(\frac{1}{3})$ $= 0.390 + 0.528 = 0.918$

Child node "Free=No" (1 Spam, 2 Not Spam): $H = -\frac{1}{3} \log_2(\frac{1}{3}) - \frac{2}{3} \log_2(\frac{2}{3}) = 0.918$

Weighted Entropy: $= \frac{3}{6}(0.918) + \frac{3}{6}(0.918) = 0.918$

Information Gain: $IG = 1.0 - 0.918 = 0.082$
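The same calculation as a small sketch working from class counts. The helper names are assumptions, and zero counts are skipped so that 0 × log₂(0) is treated as 0:

```python
import math


def entropy(counts):
    """Shannon entropy (base 2) of a node, given its class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the children."""
    n = sum(parent)
    weighted = sum(sum(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted


# Parent: 3 Spam, 3 Not Spam; children: Free=Yes (2, 1) and Free=No (1, 2)
ig = information_gain([3, 3], [[2, 1], [1, 2]])
print(f"{entropy([3, 3]):.3f} {entropy([2, 1]):.3f} {ig:.3f}")  # 1.000 0.918 0.082
```

An information gain this close to 0 means that splitting on "Free" barely reduces uncertainty about the label.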

Complete Entropy Calculation Example (EXAM FORMAT!)

💡 Click to View Full Decision Tree Example

Dataset: Predict if student will pass

GPA	Studied	Passed
Low	No	No
Low	Yes	No
Med	No	No
Med	Yes	Yes
High	No	Yes
High	Yes	Yes

Question: Calculate H(Passed), H(Passed|GPA), H(Passed|Studied), then draw decision tree.

Step 1: Calculate H(Passed) - entropy of the target variable

Passed=Yes: 3 records → P(Yes) = 3/6 = 0.5
Passed=No: 3 records → P(No) = 3/6 = 0.5

$H(Passed) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5)$ $= -0.5 \times (-1) - 0.5 \times (-1) = 0.5 + 0.5 = 1.0$

Answer: H(Passed) = 1.0 (perfect 50/50 distribution = maximum entropy)

Step 2: Calculate H(Passed | GPA) - conditional entropy after grouping by GPA

GPA = Low (2 records: 0 Yes, 2 No)

$H(Low) = -0 \log_2(0) - 1 \log_2(1) = 0$ (pure node!)

GPA = Med (2 records: 1 Yes, 1 No)

$H(Med) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5) = 1.0$

GPA = High (2 records: 2 Yes, 0 No)

$H(High) = -1 \log_2(1) - 0 \log_2(0) = 0$ (pure node!)

Weighted Average: $H(Passed|GPA) = \frac{2}{6} \times 0 + \frac{2}{6} \times 1.0 + \frac{2}{6} \times 0$ $= 0 + 0.333 + 0 = 0.333$

Answer: H(Passed|GPA) = 0.333

Step 3: Calculate H(Passed | Studied) - conditional entropy after grouping by Studied

Studied = No (3 records: 1 Yes, 2 No)

P(Yes) = 1/3, P(No) = 2/3
$H(No) = -\frac{1}{3} \log_2(\frac{1}{3}) - \frac{2}{3} \log_2(\frac{2}{3}) = 0.528 + 0.390 = 0.918$

Studied = Yes (3 records: 2 Yes, 1 No)

P(Yes) = 2/3, P(No) = 1/3
H(Yes) = 0.918 (by symmetry)

Weighted Average: $H(Passed|Studied) = \frac{3}{6} \times 0.918 + \frac{3}{6} \times 0.918 = 0.918$

Answer: H(Passed|Studied) = 0.918

Step 4: Compare Information Gain

IG(GPA) = H(Passed) - H(Passed|GPA) = 1.0 - 0.333 = 0.667 ✅ Higher!
IG(Studied) = H(Passed) - H(Passed|Studied) = 1.0 - 0.918 = 0.082

Choose GPA as the root node (higher information gain)

Step 5: Draw Decision Tree

         [GPA?]
       /   |   \
    Low   Med   High
     ↓     ↓      ↓
   [No] [Studied?] [Yes]
          /    \
        No     Yes
         ↓       ↓
       [No]   [Yes]

Log value quick reference (a calculator may be used in the exam):

log₂(0.5) = -1
log₂(1) = 0
log₂(1/3) ≈ -1.585
log₂(2/3) ≈ -0.585

Rule: 0 × log₂(0) = 0 (by convention)
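To verify the whole worked example at once, the six-row student table can be typed in and the entropies computed from the raw rows. This is a sketch under the assumption that rows are stored as (GPA, Studied, Passed) tuples; the helper names are illustrative:

```python
import math
from collections import Counter, defaultdict

# (GPA, Studied, Passed) for the six students in the table
rows = [
    ("Low", "No", "No"), ("Low", "Yes", "No"),
    ("Med", "No", "No"), ("Med", "Yes", "Yes"),
    ("High", "No", "Yes"), ("High", "Yes", "Yes"),
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def conditional_entropy(rows, feature_index):
    """Size-weighted entropy of Passed after grouping by one feature."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature_index]].append(row[-1])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


h_passed = entropy([r[-1] for r in rows])
ig_gpa = h_passed - conditional_entropy(rows, 0)
ig_studied = h_passed - conditional_entropy(rows, 1)
print(f"IG(GPA) = {ig_gpa:.3f}, IG(Studied) = {ig_studied:.3f}")
# IG(GPA) = 0.667, IG(Studied) = 0.082
```

GPA has the larger gain, so it becomes the root, in agreement with the tree drawn in Step 5.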

Decision Tree Code Template

💡 Click to View Complete Code

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report
 
# Load data
df = pd.read_csv("data.csv")
 
# Encode categorical if needed
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
for col in df.select_dtypes(include=['object']).columns:
    df[col] = le.fit_transform(df[col])
 
# Split features and target
X = df.drop('target', axis=1)
y = df['target']
 
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
 
# Create and train Decision Tree
# criterion='gini' is the default (Gini impurity, as in CART)
# criterion='entropy' splits by information gain (the measure used by ID3/C4.5)
dt_model = DecisionTreeClassifier(
    criterion='gini',    # or 'entropy'
    max_depth=5,         # Limit depth to prevent overfitting
    random_state=42
)
dt_model.fit(X_train, y_train)
 
# Evaluate
y_pred = dt_model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print(classification_report(y_test, y_pred))

🎯 Quick Reference Checklist

Python Essentials

Operator precedence: ** > *,/,//,% > +,-
String slicing: left-inclusive, right-exclusive
Floor division // rounds toward negative infinity
range(a,b) generates a to b-1
List assignment creates reference, not copy
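A few one-liners to confirm the Python rules in this checklist. The variable names and sample values are arbitrary:

```python
# Operator precedence: ** before *, /, //, % before +, -
print(2 * 3 ** 2)          # 18

# Floor division rounds toward negative infinity
print(-5 // 3)             # -2

# Slicing is left-inclusive, right-exclusive
text = "Python"
print(text[1:4])           # yth

# range(a, b) stops at b - 1
print(list(range(2, 5)))   # [2, 3, 4]

# Assignment shares the same list; copy() makes an independent one
a = [1, 2, 3]
b = a          # b is another name for the same list object
c = a.copy()   # c is a separate (shallow) copy
b.append(4)
print(a)       # [1, 2, 3, 4]
print(c)       # [1, 2, 3]
```

The last two prints show the reference rule in action: appending through `b` changes `a`, while the copy `c` is unaffected.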

Machine Learning Essentials

Bayes formula: P(A|B) = P(B|A) × P(A) / P(B)
Gini: 1 - Σ(pᵢ²)
Entropy: -Σ pᵢ log₂(pᵢ)
SVM: needs scaling, uses kernel trick
Random Forest: reduces overfitting via bagging
Decision Tree: uses Gini (CART) or Entropy (ID3)

💪 Good luck on your exam! 🎓
