{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "47ab9d3d-5a84-4b1c-a5e0-03e14abf4524", "metadata": { "tags": [ "remove_cell" ] }, "outputs": [], "source": [ "# Setting up the Colab environment. DO NOT EDIT!\n", "import os\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")\n", "\n", "try:\n", " import otter\n", "\n", "except ImportError:\n", " ! pip install -q otter-grader==4.0.0\n", " import otter\n", "\n", "if not os.path.exists('walkthrough-tests'):\n", " zip_files = [f for f in os.listdir() if f.endswith('.zip')]\n", " assert len(zip_files)>0, 'Could not find any zip files!'\n", " assert len(zip_files)==1, 'Found multiple zip files!'\n", " ! unzip {zip_files[0]}\n", "\n", "grader = otter.Notebook(colab=True,\n", " tests_dir = 'walkthrough-tests')" ] }, { "cell_type": "markdown", "id": "739d080a-ad24-4687-bb6e-aab37952233a", "metadata": { "tags": [] }, "source": [ "# Walkthrough" ] }, { "cell_type": "markdown", "id": "e920c6ea-e620-4f58-ba02-15bf10430cce", "metadata": {}, "source": [ "## Learning Objectives\n", "At the end of this learning activity you will be able to:\n", "* Create categorical comparisons with countplots.\n", "* Create quantitative comparison plots with seaborn: stripplot, barplot, and boxplot.\n", "* Create correlation-style plots with scatterplot and regplot.\n", "* Utilize `pd.melt` to plot wide data with seaborn.\n", "* Describe bootstrapping and confidence intervals."
] }, { "cell_type": "markdown", "id": "3e3d5466", "metadata": { "id": "59f3483e-0a2a-4217-8185-c5557dda482e" }, "source": [ "This week we will continue our exploration of data from a cohort study of People Living with HIV (PLwH) here at Drexel.\n", "\n", "As we discussed in the introduction, this data was collected to provide a resource for many projects across the fields of HIV, aging, inflammation, neurocognitive impairment, and immune function, as well as future projects we cannot yet anticipate.\n", "In this walkthrough we will explore a collection of cytokines and chemokines measured by a Luminex panel of common biomarkers of inflammation.\n", "We will use this data to look for correlations between cytokine biomarkers and demographic variables." ] }, { "cell_type": "markdown", "id": "dfb3c8b6-02bb-4b43-97f1-c53734e71b81", "metadata": {}, "source": [ "### Documentation\n", "\n", "I HIGHLY recommend exploring and using the seaborn documentation:\n", "\n", "* Base: https://seaborn.pydata.org/\n", "* Gallery of great examples: https://seaborn.pydata.org/examples/index.html\n", "* Function documentation (we will be using the _Function Interface_): https://seaborn.pydata.org/api.html#function-interface\n", "* Tutorial: https://seaborn.pydata.org/tutorial.html\n", "\n", "When I'm making figures I'll usually have the documentation open in another tab." ] }, { "cell_type": "markdown", "id": "e3294db5", "metadata": {}, "source": [ "---------------------------------------------" ] }, { "cell_type": "code", "execution_count": 2, "id": "02d710d4", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "\n", "# This is how we normally import seaborn\n", "import seaborn as sns\n", "\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 3, "id": "00ed4f31-4ec4-47fb-9d37-1885d9cae14b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

| | Sex | Age | isAA | egf | eotaxin | fgfbasic | gcsf | gmcsf | hgf | ifnalpha | ... | mig | mip1alpha | mip1beta | tnfalpha | vegf | cocaine_use | cannabinoid_use | neuro_screen_impairment_level | bmi | years_infected |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Male | 53.0 | Checked | 65.01 | 170.20 | 50.32 | 117.14 | 2.51 | 481.37 | 110.79 | ... | 185.29 | 104.63 | 151.15 | 17.61 | 7.54 | True | True | none | 21 | 18 |
| 1 | Female | 62.0 | Checked | 232.83 | 118.23 | 36.03 | 215.38 | 24.53 | 988.71 | 66.13 | ... | 397.24 | 242.10 | 230.87 | 51.22 | 31.60 | True | True | none | 22 | 16 |
| 2 | Male | 60.0 | Checked | 84.84 | 55.27 | 13.22 | 14.08 | 0.48 | 364.31 | 78.67 | ... | 18.63 | 34.85 | 68.34 | 2.48 | 0.84 | False | False | none | 25 | 16 |
| 3 | Male | 62.0 | Checked | 24.13 | 70.18 | 4.12 | 14.08 | 1.33 | 510.36 | 118.64 | ... | 118.63 | 113.30 | 49.15 | 10.93 | 3.53 | True | True | impaired | 29 | 21 |
| 4 | Male | 54.0 | Checked | 186.98 | 69.18 | 32.56 | 184.74 | 12.55 | 395.87 | 40.79 | ... | 140.56 | 131.83 | 241.00 | 32.01 | 10.81 | True | True | none | 26 | 16 |

5 rows × 37 columns

| | Sex | Age | isAA | egf | eotaxin | fgfbasic | gcsf | gmcsf | hgf | ifnalpha | ... | mip1beta | tnfalpha | vegf | cocaine_use | cannabinoid_use | neuro_screen_impairment_level | bmi | years_infected | multi_use | neuro_screen_ordinal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Male | 53.0 | Checked | 65.01 | 170.20 | 50.32 | 117.14 | 2.51 | 481.37 | 110.79 | ... | 151.15 | 17.61 | 7.54 | True | True | none | 21 | 18 | YY | none |
| 1 | Female | 62.0 | Checked | 232.83 | 118.23 | 36.03 | 215.38 | 24.53 | 988.71 | 66.13 | ... | 230.87 | 51.22 | 31.60 | True | True | none | 22 | 16 | YY | none |
| 2 | Male | 60.0 | Checked | 84.84 | 55.27 | 13.22 | 14.08 | 0.48 | 364.31 | 78.67 | ... | 68.34 | 2.48 | 0.84 | False | False | none | 25 | 16 | NN | none |
| 3 | Male | 62.0 | Checked | 24.13 | 70.18 | 4.12 | 14.08 | 1.33 | 510.36 | 118.64 | ... | 49.15 | 10.93 | 3.53 | True | True | impaired | 29 | 21 | YY | impaired |
| 4 | Male | 54.0 | Checked | 186.98 | 69.18 | 32.56 | 184.74 | 12.55 | 395.87 | 40.79 | ... | 241.00 | 32.01 | 10.81 | True | True | none | 26 | 16 | YY | none |

5 rows × 39 columns

| | Sex | Age | neuro_screen_ordinal | multi_use | cytokine | concentration |
|---|---|---|---|---|---|---|
| 0 | Male | 53.0 | none | YY | mcp1 | 468.72 |
| 1 | Female | 62.0 | none | YY | mcp1 | 591.70 |
| 2 | Male | 60.0 | none | NN | mcp1 | 132.80 |
| 3 | Male | 62.0 | impaired | YY | mcp1 | 816.71 |
| 4 | Male | 54.0 | none | YY | mcp1 | 414.97 |
| ... | ... | ... | ... | ... | ... | ... |
| 667 | Male | 44.0 | none | NN | il6 | 16.69 |
| 668 | Male | 59.0 | mild | YY | il6 | 15.55 |
| 669 | Male | 63.0 | mild | YY | il6 | 11.94 |
| 670 | Male | 41.0 | mild | NY | il6 | 12.48 |
| 671 | Male | 43.0 | mild | NN | il6 | 2.40 |

672 rows × 6 columns
\n", "