Troubleshooting Common Errors#
This guide helps you understand and fix common Python and pandas errors you might encounter in this course.
How to Read Error Messages#
Python error messages follow a pattern. Here’s an example:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-5-abc123def456> in <module>
1 # Select the age column
----> 2 data['Age']
KeyError: 'Age'
Reading the error:
Error Type (top): KeyError - tells you what went wrong
Traceback (middle): Shows which line caused the error (marked with ---->)
Error Message (bottom): 'Age' - gives specific details
The arrow (---->) points to the line where the error was raised. The mistake is usually on that line, but it can also come from an earlier line or cell.
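The error type and the error message can also be inspected in code. The sketch below is a minimal illustration: it uses a plain dictionary with made-up values in place of a DataFrame, catches the error, and prints the two parts separately.

```python
# Made-up record standing in for one row of data (not from the course dataset)
record = {'age': 34, 'sex': 'Female'}

try:
    record['Age']  # wrong case: the key is 'age', so this raises KeyError
except KeyError as err:
    error_type = type(err).__name__  # the error type shown at the top of the traceback
    error_message = str(err)         # the detail shown on the last line
    print('Error type:', error_type)
    print('Error message:', error_message)
```

Catching errors like this is meant for exploring how they are built. While debugging course exercises, letting the full traceback appear is usually more informative.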
Common Error Types#
KeyError#
What it means: You’re trying to access a dictionary key or DataFrame column that doesn’t exist.
Common causes:
# ❌ Wrong column name (case sensitive!)
data['Age'] # Column is actually 'age' (lowercase)
# ❌ Typo in column name
data['ART_regimen'] # Column is actually 'ART'
# ❌ Column doesn't exist at all
data['nonexistent_column']
How to fix:
# 1. Check available columns
print(data.columns)
# 2. Use the exact column name (case matters!)
data['age'] # ✅ CORRECT
# 3. For complex names, copy-paste from the columns list
print(data.columns.tolist())
Real example from our course:
# ❌ WRONG
result = data['Processing_Domain_Z']
# ✅ CORRECT (check the actual column name)
print(data.columns) # Shows: 'processing_domain_z'
result = data['processing_domain_z']
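When the list of columns is long, a small helper can point to the closest existing name. This is a sketch rather than part of the course materials: `suggest_columns` is a hypothetical helper built on the standard library's `difflib`, and the column list is a made-up example. With a real DataFrame, `data.columns` could be passed in as the second argument.

```python
import difflib

def suggest_columns(name, columns, n=3):
    """Return existing column names that look similar to `name`."""
    columns = [str(c) for c in columns]
    # A match that differs only by case or surrounding spaces is the most likely fix
    exact = [c for c in columns if c.strip().lower() == name.strip().lower()]
    if exact:
        return exact
    # Otherwise fall back to fuzzy matching on lowercase names
    # (columns that differ only by case would share one entry here)
    lowered = {c.lower(): c for c in columns}
    close = difflib.get_close_matches(name.lower(), list(lowered), n=n, cutoff=0.6)
    return [lowered[c] for c in close]

# Made-up column names for illustration
columns = ['age', 'sex', 'education', 'processing_domain_z']

print(suggest_columns('Age', columns))                  # wrong case
print(suggest_columns('Processing_Domain_Z', columns))  # wrong case
print(suggest_columns('educaton', columns))             # typo
```

The helper only suggests names. The suggestion still needs to be checked against what the column actually contains.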
TypeError#
What it means: You’re using an operation on the wrong type of data.
Common causes:
Example 1: Using + on incompatible types
# ❌ Can't add string and number
age = '25' + 5 # TypeError: can only concatenate str (not "int") to str
# ✅ Convert to same type first
age = int('25') + 5 # Returns 30
age = '25' + str(5) # Returns '255'
Example 2: Wrong bracket type
# ❌ Using () instead of [] for indexing
data('age') # TypeError: 'DataFrame' object is not callable
# ✅ Use square brackets
data['age']
Example 3: Missing quotes on strings
# ❌ Missing quotes: Python reads Male as a variable name, not a string
data[data['sex'] == Male] # NameError if no variable called Male exists
# ✅ Add quotes for string values
data[data['sex'] == 'Male']
How to fix:
Check if you’re using the right bracket type ([] vs ())
Ensure strings have quotes
Check data types: type(variable) or data.dtypes
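Checking types before combining values can be sketched as follows. The values are made up and stored as text, which is a common situation after reading a file. The example uses plain Python lists. For a pandas column, `data.dtypes` shows the stored type and `pd.to_numeric` is one way to convert it.

```python
# Made-up ages stored as text
raw_ages = ['25', '31', '40']

# Step 1: check the type before doing arithmetic
print(type(raw_ages[0]).__name__)

# Step 2: adding a number to text fails with a TypeError
try:
    raw_ages[0] + 5
except TypeError as err:
    print('TypeError:', err)

# Step 3: convert to numbers first, then the arithmetic works
ages = [int(value) for value in raw_ages]
total = ages[0] + 5
print(type(ages[0]).__name__, total)
```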
SyntaxError#
What it means: Python can’t understand your code because of a typo or missing punctuation.
Common causes:
Missing closing bracket/parenthesis
# ❌ Missing closing bracket
subset = data[data['age'] > 30
# SyntaxError: unexpected EOF while parsing
# (Python 3.10+ reports: '[' was never closed)
# ✅ Add the closing bracket
subset = data[data['age'] > 30]
Missing comma
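# ❌ Missing comma between list items
ages = [25 30 35]
# SyntaxError: invalid syntax
# ⚠️ With strings there is no error: adjacent string literals are silently joined
columns = ['age' 'sex' 'education'] # Becomes ['agesexeducation']
# ✅ Add commas
ages = [25, 30, 35]
columns = ['age', 'sex', 'education']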
Mismatched quotes
# ❌ Starting with " but ending with '
name = "Alice'
# SyntaxError: EOL while scanning string literal
# (Python 3.10+ reports: unterminated string literal)
# ✅ Match your quotes
name = "Alice"
name = 'Alice' # Both work, just be consistent
How to fix:
Look at the line indicated by the arrow (---->)
Count your brackets: each ( needs a ), each [ needs a ]
Check for missing commas in lists
Ensure quotes match (both single ' or both double ")
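A line of code can be checked for syntax problems without running it. The sketch below uses Python's built-in `compile`, which only parses the text, so the name `data` inside the strings does not need to exist. `check_syntax` is a hypothetical helper written for this illustration, and the exact wording of the message depends on the Python version.

```python
def check_syntax(code):
    """Return None if `code` parses, otherwise a short description of the problem."""
    try:
        compile(code, '<cell>', 'exec')  # parse only; nothing is executed
    except SyntaxError as err:
        return f'line {err.lineno}: {err.msg}'
    return None

good = "subset = data[data['age'] > 30]"
bad = "subset = data[data['age'] > 30"   # closing bracket is missing

print(check_syntax(good))
print(check_syntax(bad))
```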
AttributeError#
What it means: You’re trying to use a method or attribute that doesn’t exist for that object type.
Common causes:
Wrong method name (typo)
# ❌ Method doesn't exist (typo: groupby vs groupBy)
data.groupBy('ART')
# AttributeError: 'DataFrame' object has no attribute 'groupBy'
# ✅ Correct method name
data.groupby('ART')
Using DataFrame method on Series (or vice versa)
# ❌ .query() only works on DataFrames, not Series
ages = data['age']
young_ages = ages.query('age < 30')
# AttributeError: 'Series' object has no attribute 'query'
# ✅ Filter the DataFrame first, then select column
young_data = data.query('age < 30')
young_ages = young_data['age']
# OR use boolean indexing
young_ages = data[data['age'] < 30]['age']
Forgetting to call a method
# ❌ Missing parentheses to call the method
data_copy = data.copy
# No error yet: data_copy is now the method itself, not a copy.
# The AttributeError appears later, when data_copy is used like a DataFrame
# ✅ Add () to call the method
data_copy = data.copy()
How to fix:
Check spelling of method names (case matters!)
Verify you’re using a DataFrame vs Series method correctly
Don’t forget parentheses () when calling methods
Check the documentation: help(data.groupby) or online docs
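Two of these checks can be tried on any object. The sketch below uses a plain string instead of a DataFrame so it runs without any data loaded. `hasattr` reports whether a name exists on the object, and `callable` shows whether a variable holds a method rather than a result.

```python
name = 'alice'

# Spelling check: attribute names are case sensitive
print(hasattr(name, 'Upper'))  # wrong case
print(hasattr(name, 'upper'))  # correct name

# Parentheses check
not_called = name.upper   # no parentheses: this is the method itself
called = name.upper()     # with parentheses: this is the result

print(callable(not_called))  # a method is callable
print(callable(called))      # a string result is not
print(called)
```

The same calls work on a DataFrame, for example `hasattr(data, 'groupby')`.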
NameError#
What it means: You’re using a variable or function name that Python doesn’t recognize.
Common causes:
Typo in variable name
# ❌ Variable name misspelled
my_data = pd.read_csv('file.csv')
result = mydata['age']
# NameError: name 'mydata' is not defined
# ✅ Use exact variable name
result = my_data['age']
Forgot to import a library
# ❌ Using pd before importing pandas
data = pd.read_csv('file.csv')
# NameError: name 'pd' is not defined
# ✅ Import first
import pandas as pd
data = pd.read_csv('file.csv')
Variable not defined yet
# ❌ Using variable before creating it
print(result)
result = data['age'].mean()
# NameError: name 'result' is not defined
# ✅ Create variable first
result = data['age'].mean()
print(result)
How to fix:
Check spelling of variable names
Ensure you’ve run all necessary cells (especially imports!)
Make sure you’ve run cells in order (top to bottom)
In Colab, use Runtime → Restart and Run All to start fresh
IndexError#
What it means: You’re trying to access a position in a list or array that doesn’t exist.
Common causes:
# ❌ List only has 3 items (indices 0, 1, 2)
my_list = ['a', 'b', 'c']
print(my_list[3])
# IndexError: list index out of range
# ✅ Use valid indices
print(my_list[2]) # Last item
print(my_list[-1]) # Also gets last item
How to fix:
Remember Python uses 0-based indexing (first item is [0])
Check the length: len(my_list)
Use negative indices to count from the end: [-1] is the last item
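The length check can be sketched as below. `get_item` is a hypothetical helper written for this illustration: it returns a default value instead of raising an error when the position does not exist.

```python
def get_item(items, index, default=None):
    """Return items[index], or `default` when the index is out of range."""
    try:
        return items[index]
    except IndexError:
        return default

my_list = ['a', 'b', 'c']

# The largest valid index is always one less than the length
last_index = len(my_list) - 1
print(len(my_list), last_index)

print(get_item(my_list, last_index))  # last item
print(get_item(my_list, -1))          # also the last item
print(get_item(my_list, 3))           # out of range, so the default is returned
```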
ValueError#
What it means: You passed the right type of argument, but an invalid value.
Common causes:
Wrong value in function argument
# ❌ 'median' is not a valid method for normality test
pg.normality(data['age'], method='median')
# ValueError: method must be 'normaltest' or 'jarque_bera'
# ✅ Use a valid method
pg.normality(data['age'], method='normaltest')
Incompatible array shapes
# ❌ Arrays have different lengths
x = [1, 2, 3]
y = [1, 2, 3, 4, 5]
plt.plot(x, y)
# ValueError: x and y must have same first dimension
# ✅ Make sure arrays match
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 5]
plt.plot(x, y)
How to fix:
Read the error message carefully - it often tells you valid options
Check function documentation: help(function_name)
Verify data dimensions match when required
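The difference between a wrong type and a wrong value, and the dimension check, can be illustrated with built-in functions only. In this sketch `to_int` and `same_length` are hypothetical helpers and the inputs are made up.

```python
def to_int(text):
    """Convert text to an int, returning None when the text is not a whole number."""
    try:
        return int(text)
    except ValueError:
        # Right type (a string), but the value cannot be read as a number
        return None

def same_length(x, y):
    """Return True when both sequences have the same number of items."""
    return len(x) == len(y)

print(to_int('25'))
print(to_int('twenty-five'))

x = [1, 2, 3]
y = [1, 2, 3, 4, 5]
print(same_length(x, y))  # check before passing both to a plotting function
```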
Troubleshooting Workflow#
When you encounter an error, follow these steps:
Read the error type - What kind of error is it?
Find the line - Look for the arrow ----> in the traceback
Check the error message - What specific detail does it give?
Consult this guide - Find your error type above
Try the fixes - Work through the suggested solutions
Still stuck?
Print intermediate values: print(variable)
Check types: type(variable) or data.dtypes
Review earlier cells - did they all run successfully?
Restart and run all: Runtime → Restart and Run All
Ask for help in office hours or discussion board
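Printing a value and its type together often saves a step. The sketch below shows one possible way to do that. `describe` is a hypothetical helper, not something provided by the course, and the variables are made up.

```python
def describe(name, value):
    """Build a one-line summary: variable name, type, and a shortened value."""
    text = repr(value)
    if len(text) > 60:
        text = text[:57] + '...'  # keep long values readable
    return f'{name}: type={type(value).__name__}, value={text}'

# Made-up intermediate values
age = '25'
ages = [25, 31, 40]

print(describe('age', age))    # reveals that age is text, not a number
print(describe('ages', ages))
```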
Prevention Tips#
Run cells in order - Start from the top and work down
Check column names - Use data.columns frequently
Use tab completion - Start typing and press Tab to autocomplete
Match the examples - Follow the pattern from walkthroughs
Test as you go - Don’t write lots of code before running it
Read error messages - They often tell you exactly what’s wrong
Additional Resources#
Python Syntax Quick Reference - For syntax questions
Official pandas documentation: https://pandas.pydata.org/docs/