Lab#

Learning Objectives#

At the end of this learning activity you will be able to:

  • Practice creating statistical figures to answer biological questions.

  • Practice writing figure legends for statistical figures.

  • Practice writing descriptive reasoning about a figure.

Note: It is difficult to automatically grade figures because there are many “correct” answers. So, most questions will accept any figure or axis and then ask you to answer a question that should be obvious from a properly generated figure. For all questions, assume a 95% confidence interval.

Use this lab as an opportunity to explore the different plot-types that Seaborn can make. Check out https://seaborn.pydata.org/examples/index.html for ideas.


import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
data = pd.read_csv('cytokine_data.csv')
data.head()

Explore the effect of cocaine use on mcp1#

Q1: Do cocaine users have a higher level of expression of mcp1?#

Generate a plot which displays the spread of mcp1 measurements across participants, split by cocaine use. This can be a boxplot, stripplot, violin plot, or a number of others.

Use a markdown cell to write a figure legend. At a minimum:

  • Single declarative sentence stating the conclusion of the figure.

  • Each axis must be described.

  • Every color, line, and dot must be described.

  • Error bars must be described.

Then answer:

Do cocaine users or non-users have higher levels of mcp1? Use a markdown cell to justify your answer using your figure.

Checked variables:

  • q1_ax - A matplotlib Axes object with a plot showing mcp1 distribution by cocaine use

    • Should be a boxplot, violin plot, or similar showing spread of data

  • Markdown cell with figure caption and answer

Hint: Use seaborn's boxplot or violinplot with x='cocaine_use' and y='mcp1'. Create a figure caption describing what you see. Compare the distributions between cocaine users and non-users. See Module 6 walkthrough for data visualization examples.
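The hint above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not a solution: the `demo` DataFrame, its group labels, and its values are invented here, and only the column names `cocaine_use` and `mcp1` come from the hint. Swap in the lab's `data` DataFrame and adjust to taste.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in data, NOT the lab's cytokine_data.csv.
# The group labels and mcp1 values below are invented for illustration.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "cocaine_use": np.repeat(["non-use", "use"], 30),
    "mcp1": rng.normal(loc=100, scale=15, size=60),
})

fig, ax = plt.subplots(figsize=(4, 4))
# sns.boxplot draws on the given Axes and returns that same Axes.
demo_ax = sns.boxplot(data=demo, x="cocaine_use", y="mcp1", color="lightgray", ax=ax)
# Overlay the individual measurements so the raw spread is visible.
sns.stripplot(data=demo, x="cocaine_use", y="mcp1", color="black", size=3, ax=demo_ax)
demo_ax.set_xlabel("Cocaine use")
demo_ax.set_ylabel("mcp1 expression (synthetic units)")
```

Keeping the returned Axes in a variable makes it easy to relabel the axes, which in turn makes the figure legend easier to write.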

Points: 5 | Public Checks: 2 | Hidden Tests: 1

# Make your plot here

q1_plot = ...

# Create a markdown cell after this to write a figure legend.
# Do cocaine users or non-users have higher levels of mcp1?
# Answer 'users', 'non-users', or 'same'
q1_higher_level = ...

# Use the cell below to write your justification given the figure you presented.
grader.check("q1_cocaine_use_spread")

Q2: Do cocaine users or non-users have a higher average level of mcp1?#

Generate a plot which displays the mean mcp1 expression, with its confidence interval, for each cocaine-use group.

Then, write a figure legend that at a minimum contains:

  • Single declarative sentence stating the conclusion of the figure.

  • Each axis must be described.

  • Every color, line, and dot must be described.

  • Error bars must be described.

Then, use that figure to answer whether cocaine users or non-users have a higher average level of mcp1. Include a markdown cell that justifies your answer given the figure you presented.

Checked variables:

  • q2_ax - A matplotlib Axes object showing mean mcp1 with confidence intervals by cocaine use

    • Should show error bars (e.g., using barplot with error bars or pointplot)

  • Markdown cell with figure caption and answer

Hint: Use seaborn's barplot (shows mean + CI by default) or pointplot with x='cocaine_use' and y='mcp1'. Write a caption explaining the means and confidence intervals. Determine which group has higher average. See Module 6 walkthrough for examples with confidence intervals.
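As a rough sketch of the hint above, the example below draws group means with seaborn's default error bars, which are a bootstrapped 95% confidence interval of the mean. The `demo` DataFrame is synthetic and its values are invented; only the column names come from the hint.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in data, NOT the lab's cytokine_data.csv.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "cocaine_use": np.repeat(["non-use", "use"], 30),  # invented labels
    "mcp1": rng.normal(loc=100, scale=15, size=60),     # invented values
})

fig, ax = plt.subplots(figsize=(4, 4))
# With default settings, bar height is the group mean and the error bar
# is a bootstrapped 95% confidence interval of that mean.
demo_ax = sns.barplot(data=demo, x="cocaine_use", y="mcp1", ax=ax)
demo_ax.set_xlabel("Cocaine use")
demo_ax.set_ylabel("Mean mcp1 expression (synthetic units)")

# The plotted means can be cross-checked numerically.
group_means = demo.groupby("cocaine_use")["mcp1"].mean()
print(group_means)
```

When comparing groups, look at whether the intervals overlap as well as which bar is taller.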

Points: 5 | Public Checks: 2 | Hidden Tests: 1

# Generate a plot which displays the mean mcp1 expression, with its confidence interval, for each cocaine-use group

q2_plot = ...
# Do cocaine users or non-users have a higher average level of mcp1?
# Answer 'users', 'non-users', or 'same'
q2_higher_mean = ...

# Make a cell below this to explain your reasoning based on the figure.
grader.check("q2_cocaine_use_mean")

Q3: Does sex impact the effect of cocaine use on the average level of mcp1 expression?#

Generate a plot which displays the mean mcp1 expression, with its confidence interval, for each cocaine-use group, split by sex.

Then use a markdown cell to write a figure caption that, at a minimum, includes:

  • A single declarative sentence stating the conclusion of the figure.

  • Each axis must be described.

  • Every color, line, and dot must be described.

  • Error bars must be described.

Then answer the question: Does sex modulate the impact of cocaine use on mcp1 expression? Create a markdown cell afterwards that describes your answer based on the figure you created.

Checked variables:

  • q3_ax - A matplotlib Axes object showing mean mcp1 by cocaine use, split by sex

    • Should show confidence intervals

    • Should use color or grouping to separate by sex

  • Markdown cell with figure caption and answer

Hint: Use seaborn's barplot or pointplot with x='cocaine_use', y='mcp1', hue='sex'. This will show if the cocaine effect differs between males and females. Write a caption explaining the interaction. See Module 6 walkthrough for grouped plots.
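One way to picture the grouped plot described in the hint is a point plot with one line per sex. The sketch below uses synthetic data: the `demo` values and the category labels are invented, and only the column names `cocaine_use`, `mcp1`, and `sex` come from the hint.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in data, NOT the lab's cytokine_data.csv.
rng = np.random.default_rng(2)
demo = pd.DataFrame({
    "cocaine_use": np.tile(np.repeat(["non-use", "use"], 20), 2),  # invented labels
    "sex": np.repeat(["female", "male"], 40),                      # invented labels
    "mcp1": rng.normal(loc=100, scale=15, size=80),                 # invented values
})

fig, ax = plt.subplots(figsize=(5, 4))
# Each point is a group mean; the vertical bars are seaborn's default
# bootstrapped 95% confidence interval. dodge=True offsets the two sexes
# horizontally so their error bars do not sit on top of each other.
demo_ax = sns.pointplot(data=demo, x="cocaine_use", y="mcp1", hue="sex",
                        dodge=True, ax=ax)
demo_ax.set_xlabel("Cocaine use")
demo_ax.set_ylabel("Mean mcp1 expression (synthetic units)")
```

In a plot like this, roughly parallel lines suggest the cocaine effect is similar in both sexes, while lines that diverge or cross suggest that sex modulates it.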

Points: 5 | Public Checks: 2 | Hidden Tests: 1

# Generate a plot which displays the mean mcp1 expression, with its confidence interval, for each cocaine-use group, split by sex

q3_plot = ...
# Is it 'likely' or 'unlikely' that sex has an impact on the effect of cocaine use on mcp1?
q3_gender_impact = ...
grader.check("q3_cocaine_use_gender_mean")

Q4: Is there a correlation between infection length and mcp1 expression?#

Generate a plot which displays the relationship between years infected and mcp1 expression.

Then use a markdown cell to write a figure caption that, at a minimum, includes:

  • A single declarative sentence stating the conclusion of the figure.

  • Each axis must be described.

  • Every color, line, and dot must be described.

  • Error bars must be described.

Then answer the question: Is there a correlation between infection length and mcp1 expression? Create a markdown cell afterwards that describes your answer based on the figure you created.

Checked variables:

  • q4_ax - A matplotlib Axes object with a scatterplot showing years_infected vs mcp1

    • Should show the relationship between continuous variables

    • Consider adding a regression line

  • Markdown cell with figure caption and answer

Hint: Use seaborn's regplot or scatterplot with x='years_infected' and y='mcp1'. A regression line helps visualize correlation. Write a caption describing the relationship. See Module 6 walkthrough for correlation plots.
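A minimal sketch of the regression plot mentioned in the hint, again on synthetic data. The `demo` values are invented and carry no information about the real dataset; only the column names `years_infected` and `mcp1` come from the hint. The Pearson coefficient is computed with pandas as a numeric companion to the figure.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in data, NOT the lab's cytokine_data.csv.
rng = np.random.default_rng(3)
demo = pd.DataFrame({
    "years_infected": rng.uniform(0, 20, size=60),      # invented values
    "mcp1": rng.normal(loc=100, scale=15, size=60),      # invented values
})

fig, ax = plt.subplots(figsize=(5, 4))
# regplot draws the scatter, a fitted line, and (by default) a shaded
# 95% confidence band around the fitted line.
demo_ax = sns.regplot(data=demo, x="years_infected", y="mcp1", ax=ax)
demo_ax.set_xlabel("Years infected")
demo_ax.set_ylabel("mcp1 expression (synthetic units)")

# Pearson correlation as a numeric summary of the same relationship.
r = demo["years_infected"].corr(demo["mcp1"])
print(f"Pearson r = {r:.2f}")
```

A confidence band that comfortably contains a flat line is a visual cue that the slope may not differ from zero.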

Points: 5 | Public Checks: 2 | Hidden Tests: 1

# Generate a plot which displays the relationship between years_infected and mcp1 expression


q4_plot = ...
# Is there a correlation between infection length and mcp1 expression? 'yes' or 'no'
q4_infection_length_corr = ...
grader.check("q4_infection_length")

Q5: Does cocaine use impact the relationship between infection length and mcp1 expression?#

Generate a plot which displays the impact of cocaine use on the relationship between years infected and mcp1 expression.

Then use a markdown cell to write a figure caption that, at a minimum, includes:

  • A single declarative sentence stating the conclusion of the figure.

  • Each axis must be described.

  • Every color, line, and dot must be described.

  • Error bars must be described.

Then answer the question: Does cocaine use impact the relationship between infection length and mcp1 expression? Create a markdown cell afterwards that describes your answer based on the figure you created.

Checked variables:

  • q5_ax - A matplotlib Axes object showing years_infected vs mcp1, split by cocaine use

    • Should use color or separate lines to show different groups

    • Consider showing regression lines for each group

  • Markdown cell with figure caption and answer

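Hint: Use seaborn's lmplot with hue='cocaine_use', or call regplot once per group on the same Axes (regplot itself does not accept a hue argument), to show separate regression lines for each group. Note that lmplot returns a FacetGrid rather than an Axes. Compare the slopes and patterns. Write a caption explaining if cocaine use changes the relationship. See Module 6 walkthrough for grouped regression plots.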
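Because `regplot` has no `hue` parameter, one way to get per-group regression lines on a single Axes is to call it once per group. The sketch below does this on synthetic data: the `demo` values and group labels are invented, and only the column names come from the hints in this lab.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in data, NOT the lab's cytokine_data.csv.
rng = np.random.default_rng(4)
demo = pd.DataFrame({
    "cocaine_use": np.repeat(["non-use", "use"], 30),   # invented labels
    "years_infected": rng.uniform(0, 20, size=60),       # invented values
    "mcp1": rng.normal(loc=100, scale=15, size=60),       # invented values
})

fig, ax = plt.subplots(figsize=(5, 4))
# One regplot call per group, all drawn on the same Axes. Each call picks
# the next color in the Axes color cycle and labels its scatter points.
for label, group in demo.groupby("cocaine_use"):
    sns.regplot(data=group, x="years_infected", y="mcp1", label=label, ax=ax)
ax.legend(title="Cocaine use")
ax.set_xlabel("Years infected")
ax.set_ylabel("mcp1 expression (synthetic units)")
demo_ax = ax
```

Compare the slopes of the fitted lines and whether their shaded 95% confidence bands overlap across the range of infection lengths.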

Points: 5 | Public Checks: 2 | Hidden Tests: 1

# Generate a plot which displays the relationship between years_infected and mcp1 expression, split by cocaine use

q5_plot = ...
# Does cocaine use impact the rate of mcp1 increase with infection length? 'yes' or 'no'
q5_infection_length_cocaine_slope = ...
grader.check("q5_infection_length_cocaine")

Submission#

Check:

  • That all tables and graphs are rendered properly.

  • That the code completes without errors when run with Restart & Run All.

  • That all checks pass.

Then save the notebook and use File -> Download -> Download .ipynb. Upload this file to BBLearn.