Python Types Reference Guide#
This guide explains the common data types you’ll encounter in this course and what you can do with them.
How to Check a Type#
# Check the type of any variable
type(my_variable)
# Example
x = 5
print(type(x)) # Output: <class 'int'>
# Check if something is a specific type
isinstance(x, int) # Returns True or False
Python Built-in Types#
Numbers#
int (Integer)#
What it is: Whole numbers (no decimal point)
Examples:
age = 25
count = 100
negative = -5
Common operations:
x = 10
y = 3
# Arithmetic
x + y # Addition: 13
x - y # Subtraction: 7
x * y # Multiplication: 30
x / y # Division: 3.333...
x // y # Integer division: 3
x % y # Modulo (remainder): 1
x ** y # Power: 1000
# Comparison
x > y # Greater than: True
x == y # Equal to: False
float (Floating Point Number)#
What it is: Numbers with decimal points
Examples:
height = 1.75
temperature = 98.6
pi = 3.14159
Common operations:
# Same as int, plus:
round(3.14159, 2) # Round to 2 decimals: 3.14
abs(-5.5) # Absolute value: 5.5
Converting between types:
int(3.9) # Converts to int: 3 (truncates, doesn't round)
float(5) # Converts to float: 5.0
int('42') # String to int: 42
float('3.14') # String to float: 3.14
Text#
str (String)#
What it is: Text data
Examples:
name = 'Alice'
message = "Hello, World!"
column_name = 'age'
Common operations:
text = 'Hello World'
# Concatenation
'Hello' + ' ' + 'World' # 'Hello World'
# Length
len(text) # 11
# Case conversion
text.lower() # 'hello world'
text.upper() # 'HELLO WORLD'
# Splitting
text.split() # ['Hello', 'World']
# Checking contents
'Hello' in text # True
text.startswith('H') # True
text.endswith('d') # True
# Replacing
text.replace('World', 'Python') # 'Hello Python'
# Stripping whitespace
' hello '.strip() # 'hello'
F-strings (formatted strings):
name = 'Alice'
age = 25
# Modern way (Python 3.6+)
message = f'{name} is {age} years old'
# Output: 'Alice is 25 years old'
# With formatting
pi = 3.14159
f'{pi:.2f}' # '3.14' (2 decimal places)
Collections#
list#
What it is: Ordered, mutable collection
Examples:
numbers = [1, 2, 3, 4, 5]
names = ['Alice', 'Bob', 'Charlie']
mixed = [1, 'two', 3.0, True]
Common operations:
my_list = [1, 2, 3]
# Accessing elements (0-indexed)
my_list[0] # First element: 1
my_list[-1] # Last element: 3
my_list[0:2] # Slice: [1, 2]
# Adding elements
my_list.append(4) # Add to end: [1, 2, 3, 4]
my_list.insert(0, 0) # Insert at position: [0, 1, 2, 3, 4]
my_list.extend([5, 6]) # Add multiple: [0, 1, 2, 3, 4, 5, 6]
# Removing elements
my_list.remove(0) # Remove first occurrence of value
my_list.pop() # Remove and return last element
my_list.pop(0) # Remove and return element at index
# Other operations
len(my_list) # Length
3 in my_list # Check if contains: True/False
my_list.sort() # Sort in place
sorted(my_list) # Return sorted copy
my_list.reverse() # Reverse in place
tuple#
What it is: Ordered, immutable collection (can’t be changed after creation)
Examples:
coordinates = (10, 20)
rgb = (255, 128, 0)
Common operations:
my_tuple = (1, 2, 3)
# Accessing (same as list)
my_tuple[0] # 1
len(my_tuple) # 3
# Cannot modify!
# my_tuple[0] = 5 # This will cause an error!
# Unpacking
x, y = (10, 20) # x=10, y=20
When to use: Function returns, dictionary keys, when you want to prevent modification
dict (Dictionary)#
What it is: Unordered collection of key-value pairs
Examples:
person = {'name': 'Alice', 'age': 25, 'city': 'Boston'}
settings = {'debug': True, 'timeout': 30}
Common operations:
my_dict = {'name': 'Alice', 'age': 25}
# Accessing values
my_dict['name'] # 'Alice'
my_dict.get('name') # 'Alice' (safer - won't error if key missing)
my_dict.get('city', 'Unknown') # 'Unknown' (default if not found)
# Adding/modifying
my_dict['city'] = 'Boston' # Add new key-value
my_dict['age'] = 26 # Update existing
# Checking
'name' in my_dict # Check if key exists: True
len(my_dict) # Number of key-value pairs
# Getting all keys/values
my_dict.keys() # dict_keys(['name', 'age', 'city'])
my_dict.values() # dict_values(['Alice', 26, 'Boston'])
my_dict.items() # key-value pairs
# Removing
my_dict.pop('city') # Remove and return value
del my_dict['age'] # Remove key-value pair
Iterating:
# Over keys
for key in my_dict:
print(key)
# Over values
for value in my_dict.values():
print(value)
# Over key-value pairs
for key, value in my_dict.items():
print(f'{key}: {value}')
set#
What it is: Unordered collection of unique values
Examples:
unique_numbers = {1, 2, 3, 4, 5}
categories = {'A', 'B', 'C'}
Common operations:
my_set = {1, 2, 3}
# Adding
my_set.add(4) # {1, 2, 3, 4}
# Removing
my_set.remove(2) # Removes 2, errors if not found
my_set.discard(2) # Removes 2, no error if not found
# Checking
3 in my_set # True
# Set operations
set1 = {1, 2, 3}
set2 = {3, 4, 5}
set1.union(set2) # {1, 2, 3, 4, 5}
set1.intersection(set2) # {3}
set1.difference(set2) # {1, 2}
When to use: Remove duplicates, test membership, mathematical set operations
bool (Boolean)#
What it is: True or False values
Examples:
is_valid = True
has_error = False
Common uses:
# In conditions
if is_valid:
print('Valid!')
# Comparisons return booleans
5 > 3 # True
'a' == 'b' # False
# Logical operations
True and False # False
True or False # True
not True # False
pandas Types#
pd.Series#
What it is: One-dimensional labeled array (like a column in Excel)
Examples:
import pandas as pd
# From a list
ages = pd.Series([25, 30, 35, 40])
# From a dictionary (keys become index)
grades = pd.Series({'Alice': 90, 'Bob': 85, 'Charlie': 92})
# Single column from DataFrame
age_column = data['age'] # This is a Series
Common operations:
series = pd.Series([1, 2, 3, 4, 5])
# Accessing elements
series[0] # First element
series.iloc[0] # First element by position
series.head() # First 5 elements
# Statistics
series.mean() # Average
series.median() # Median
series.std() # Standard deviation
series.min() # Minimum
series.max() # Maximum
series.sum() # Sum
series.describe() # Summary statistics
# Boolean operations
series > 3 # Returns Series of True/False
series[series > 3] # Filter: elements > 3
# Other operations
series.unique() # Unique values
series.value_counts() # Count of each unique value
series.isna() # Check for missing values
series.dropna() # Remove missing values
Key attributes:
series.values # Get underlying numpy array
series.index # Get index
series.shape # Shape (number of elements,)
series.dtype # Data type
pd.DataFrame#
What it is: Two-dimensional labeled data structure (like a spreadsheet)
Examples:
import pandas as pd
# From a dictionary
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'age': [25, 30, 35],
'city': ['Boston', 'NYC', 'LA']
})
# Reading data
df = pd.read_csv('data.csv')
Common operations:
# Viewing data
df.head() # First 5 rows
df.tail() # Last 5 rows
df.info() # Summary info
df.describe() # Statistical summary
df.shape # (rows, columns)
# Selecting columns
df['age'] # Single column (Series)
df[['age', 'name']] # Multiple columns (DataFrame)
# Selecting rows
df.loc[0] # Row by label
df.iloc[0] # Row by position
df.loc[0:5] # Rows 0 through 5 (inclusive)
df.iloc[0:5] # Rows 0 through 4 (exclusive)
# Filtering
df[df['age'] > 30] # Rows where age > 30
df.query('age > 30') # Same using query
df[(df['age'] > 30) & (df['city'] == 'NYC')] # Multiple conditions
# Adding columns
df['age_squared'] = df['age'] ** 2
# Grouping
df.groupby('city').mean() # Average by city
df.groupby('city')['age'].mean() # Average age by city
# Sorting
df.sort_values('age') # Sort by age
df.sort_values('age', ascending=False) # Descending
# Missing data
df.isna() # Check for missing values
df.dropna() # Remove rows with missing values
df.fillna(0) # Replace missing with 0
Key attributes:
df.columns # Column names
df.index # Row index
df.shape # (rows, columns)
df.dtypes # Data type of each column
df.values # Underlying numpy array
Checking type:
# Is it a Series?
isinstance(df['age'], pd.Series) # True
# Is it a DataFrame?
isinstance(df, pd.DataFrame) # True
matplotlib Types#
plt.Figure#
What it is: The entire window/page that holds your plot(s)
Creating:
import matplotlib.pyplot as plt
# Automatically created when you plot
plt.plot([1, 2, 3]) # Creates figure automatically
# Explicitly create
fig = plt.figure(figsize=(10, 6))
# Create figure with subplots
fig, ax = plt.subplots() # 1 plot
fig, axes = plt.subplots(2, 2) # 2x2 grid of plots
Common operations:
fig = plt.figure()
# Size
fig.set_size_inches(10, 6)
# Saving
fig.savefig('my_plot.png')
fig.savefig('my_plot.pdf', dpi=300)
# Showing
plt.show()
plt.Axes#
What it is: A single plot area within a figure (the actual plot)
Creating:
# Method 1: Automatically created
plt.plot([1, 2, 3]) # Creates both figure and axes
# Method 2: Explicit creation
fig, ax = plt.subplots()
# Method 3: From seaborn/pandas
ax = sns.boxplot(data=df, x='category', y='value')
ax = df['age'].plot()
Common operations:
fig, ax = plt.subplots()
# Plotting on axes
ax.plot([1, 2, 3], [1, 4, 9]) # Line plot
ax.scatter([1, 2, 3], [1, 4, 9]) # Scatter plot
ax.bar(['A', 'B', 'C'], [10, 20, 15]) # Bar plot
ax.hist([1, 2, 2, 3, 3, 3, 4]) # Histogram
# Labels and titles
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_title('My Plot')
# Limits
ax.set_xlim(0, 10)
ax.set_ylim(0, 100)
# Legend
ax.legend(['Series 1', 'Series 2'])
# Grid
ax.grid(True)
Checking type:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
isinstance(fig, plt.Figure) # True
isinstance(ax, plt.Axes) # True
# Many plotting functions return Axes
result = sns.boxplot(data=df, x='group', y='value')
isinstance(result, plt.Axes) # True
Key difference:
Figure = The whole canvas
Axes = The plot itself (yes, confusing name - it’s not just the axes!)
# One figure with multiple axes
fig, (ax1, ax2) = plt.subplots(1, 2) # 1 row, 2 columns
ax1.plot([1, 2, 3])
ax1.set_title('Plot 1')
ax2.plot([3, 2, 1])
ax2.set_title('Plot 2')
plt.show()
Quick Type Check Cheatsheet#
# Numbers
type(5) # <class 'int'>
type(5.0) # <class 'float'>
# Text
type('hello') # <class 'str'>
# Collections
type([1, 2, 3]) # <class 'list'>
type((1, 2, 3)) # <class 'tuple'>
type({1, 2, 3}) # <class 'set'>
type({'a': 1}) # <class 'dict'>
# Boolean
type(True) # <class 'bool'>
# pandas
type(df['age']) # <class 'pandas.core.series.Series'>
type(df) # <class 'pandas.core.frame.DataFrame'>
# matplotlib
type(fig) # <class 'matplotlib.figure.Figure'>
type(ax) # <class 'matplotlib.axes._axes.Axes'>
Common Type Confusions#
“Why is this a Series and not a DataFrame?”#
# Single bracket returns Series
df['age'] # Series
# Double bracket returns DataFrame
df[['age']] # DataFrame
“What’s the difference between a list and a Series?”#
List: Basic Python, no labels, fewer methods
Series: pandas, has index labels, many statistical methods
# List
my_list = [1, 2, 3]
my_list.mean() # ERROR! Lists don't have mean()
# Series
my_series = pd.Series([1, 2, 3])
my_series.mean() # 2.0 - Works!
“When do I get Figure vs Axes?”#
# Returns Axes only
ax = sns.boxplot(data=df, x='group', y='value')
ax = df.plot()
# Returns both Figure and Axes
fig, ax = plt.subplots()
# To get figure from axes
fig = ax.get_figure()