Welcome Spring 2025 Students!

Lab#

Remember, all assignments are due at 11:59 PM (Philadelphia time) on the Sunday of each instructional week.

Learning Objectives#

At the end of this learning activity you will be able to:

  • Employ pg.chi2_independence to test for an association between two categorical variables.

  • Practice testing variables for normality.

  • Employ pg.ttest, pg.anova, and pg.kruskal to look for differences in a dependent variable between the groups defined by a categorical variable.

Introduction#

In this lab you will explore the effects of antiretroviral medications on neurological impairment. In this cohort, we have two major drug regimens, d4T (Stavudine) and the newer Emtricitabine/tenofovir (Truvada). The older Stavudine is suspected to have neurotoxic effects that are not found in the newer Truvada.

To evaluate this effect, the participants in this cohort have completed an extensive neuropsychological exam that measures each of 6 domains of neurocognition:

  • Processing Speed

  • Executive Function

  • Language

  • Visuospatial processing

  • Learning and Memory

  • Motor Function

Each of these domains is measured by a number of tests. The results of these tests are then compared to demographically matched individuals (age, race, gender, and education) in order to scale the values appropriately.

These values are on a z-scale. A z-scale is a transformation such that the mean is 0 and the standard deviation is 1. Therefore, a person with motor_domain_z = 0 is performing at the average of matched individuals. A person with motor_domain_z = -1 is performing 1 standard deviation below the average of matched individuals.
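As a minimal sketch of the z-transformation described above, the snippet below standardizes one participant's score against a reference group. The scores and variable names are hypothetical and are not taken from the lab data; the lab's columns are already on the z-scale.

```python
import numpy as np

# Hypothetical raw test scores for a matched reference group (not lab data)
reference_scores = np.array([42.0, 50.0, 47.0, 55.0, 51.0, 45.0])
# Hypothetical raw score for one participant
participant_score = 40.0

# z = (score - reference mean) / reference standard deviation
ref_mean = reference_scores.mean()
ref_sd = reference_scores.std(ddof=1)  # sample standard deviation
participant_z = (participant_score - ref_mean) / ref_sd

# Standardizing the reference group itself yields mean 0 and SD 1
reference_z = (reference_scores - ref_mean) / ref_sd

print(round(participant_z, 2))
print(round(reference_z.mean(), 2), round(reference_z.std(ddof=1), 2))
```

A negative participant_z means the participant scored below the reference average, which is how the domain columns in this lab should be read.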

This leads to a scale of:

  • Z < -2 : Significant impairment

  • -2 < Z < -1 : Mild impairment

  • Z > -1 : No evidence of impairment
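One way to apply a scale like this is pd.cut, sketched below on made-up z-scores. The scale above does not say where scores of exactly -2 or -1 belong, so this sketch assumes they fall into the less severe category; that choice, the series name, and the values are all illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical z-scores, for illustration only
z = pd.Series([-2.5, -1.4, -0.3, 0.8, -2.0, -1.0], name="example_domain_z")

labels = ["Significant impairment", "Mild impairment", "No evidence of impairment"]

# right=False makes each bin closed on the left: [-inf, -2), [-2, -1), [-1, inf)
# Assumption: scores exactly at -2 or -1 go to the less severe category
category = pd.cut(z, bins=[-np.inf, -2, -1, np.inf], labels=labels, right=False)

# Count how many scores fall in each category
counts = category.value_counts(sort=False)
print(counts)
```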

import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

import pingouin as pg

%matplotlib inline
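# Load the cohort data from a CSV file located next to the notebook
data = pd.read_csv('hiv_neuro_data.csv')
# Store education as a float so it is treated as a numeric variable
data['education'] = data['education'].astype(float)
# Preview the first rows
data.head()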

Q1: How many participants are suffering from impairment?#

Using the thresholds above, create a bar chart showing the number of individuals with mild or significant impairment for each of the domains.
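The counting step can be sketched on a toy table as follows. The column names and values here are hypothetical stand-ins, not the lab's columns; the idea is only that comparing a whole DataFrame to a cutoff gives a True/False table whose column sums are counts.

```python
import pandas as pd

# Hypothetical z-scores for three made-up domains (not the lab data)
toy = pd.DataFrame({
    "domain_a_z": [-2.3, -1.2, 0.4, -0.6, -1.7],
    "domain_b_z": [0.1, -0.9, -1.1, 0.8, 0.3],
    "domain_c_z": [-1.5, -2.1, -1.3, -1.9, 0.2],
})

# Each cell becomes True when the score is below -1;
# summing a column counts its True values
impaired_counts = (toy < -1).sum()
print(impaired_counts)

# Name of the column with the largest count
most_impaired = impaired_counts.idxmax()
print(most_impaired)
```

A Series like impaired_counts can be drawn with impaired_counts.plot(kind="bar"), which returns a matplotlib Axes.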

# Generate a figure
q1_plot = ...
# Which column has the most impaired individuals in this cohort?

q1_most_impaired = ...
grader.check("q1_impaired_bar")

Q2: Is Visuospatial impairment linked with ART therapy?#

Using the thresholds above, binarize individuals into impaired and non-impaired based on their visuospatial_domain_z. Then, use a chi2 test to test for an association between impairment status and each individual's ART regimen.

# Create a countplot which visualizes this comparison


# Generate a figure showing this comparison
q2_plot = ...
# Perform a chi2 test
# Is there a linkage between Visuospatial impairment and ART regimen? 'yes' or 'no'
q2_linkage = ...

# Which therapy is leading to more impairment? 'Stavudine' or 'Truvada'
q2_therapy = 'Stavudine'
grader.check("q2_impaired_v_art")

Q3: Is Visuospatial score linked with ART therapy?#

Evaluate the normality of visuospatial_domain_z, then choose the appropriate test to compare it between the two therapies.

Refer to the pingouin guidelines: https://pingouin-stats.org/build/html/guidelines.html

# Assess the normality of the visuospatial_domain_z scale


# Answer yes or no.
q3_is_normal = ...
# Generate a figure showing this comparison
q3_plot = ...
# Using the appropriate test, determine whether there is a difference between ART regimens
# Is visuospatial_domain_z significantly different between ART regimens? 'yes' or 'no'
q3_sig_diff = ...
grader.check("q3_visuo_v_art")

Q4: Evaluate a potential covariate#

ART use is likely not the only thing that impacts neurocognitive impairment. Use similar methods to evaluate the impact of any of:

  • sex

  • race

  • education

  • age

  • YearsSeropositive (data['YS_binned'])

on visuospatial_domain_z. You can use any comparison method we have discussed so far.

Points: 5

# Generate a figure of your comparison
q4_plot = ...
# Choose the appropriate test for your comparison
...
# Is there a linkage between Visuospatial domain and your covariate? 'yes' or 'no'
q4_is_sig = ...
grader.check("q4_covariates")

In this lab you explored the linkage between ART regimens and the visuospatial domain. We used tools like chi2 tests and various tests of group means to determine whether categorical variables were associated with impairment. Next week we will use single and multiple regressions to compare continuous variables and gain more statistical power.


Submission#

Check:

  • That all tables and graphs are rendered properly.

  • Code completes without errors by using Restart & Run All.

  • All checks pass.

Then save the notebook and use File -> Download -> Download .ipynb. Upload this file to BBLearn.