Work in progress!

Walkthrough#

Remember, all assignments are due before the weekly synchronous session.

Learning Objectives#

At the end of this learning activity you will be able to:

  • Use basic arithmetic operations in Python.

  • Summarize the basic expression syntax in Python.

  • Write an equation that uses the result of one variable to calculate the value of another.

  • Create basic f-strings in Python to display dynamically created data.

  • Summarize a general strategy for using Python to calculate dilutions.

Programmatic Arithmetic in Python#

In the lab we often have tasks that we repeat over and over again. These can be anything from counting the number of cells on a plate, to normalizing values against a reference, to calculating dilutions for stock chemicals. Automating these types of tasks can drastically reduce the time they take. This week we’ll use a common problem from molecular biology as our jumping-off point into Python.

Recently, my lab obtained a Nanopore MinION. It is a 1000 dollar, USB-key sized DNA sequencer that reads millions of bases for about 100 dollars per sample. As part of a Senior Design Project we used the device to track the COVID outbreak in the Drexel community using rapid sequencing. Watch the video explaining the project in the Recommended Materials for more context. This protocol requires numerous tedious calculations relating mass, moles, and concentrations. This week we will explore how to use Python to automate these calculations.

The Nanopore sequencing protocol requires the operator to perform 3 enzymatic reactions:

  1. End-Prep: Prepare the 3’ and 5’ ends of the DNA by removing single-basepair overhangs and adding a single A at the end of the molecule.

  2. Barcode ligation: Attach unique barcodes to each sample using a T overhang so each sample has an individual key at the start of the sequence.

  3. Adapter ligation: After pooling each sample, another DNA molecule (called an adapter) needs to be added so it can attach to the motor protein inside the Nanopore device.

Refer to the online textbook for more detail.

The Problem#

Just like baking, when performing enzymatic reactions it is critical that we use the right amount of each ingredient. The Nanopore enzymatic reagents come in prescribed amounts and it is up to the operator to ensure that the correct initial amount of template DNA is added to each reaction.

The amount of template DNA needed for each reaction is listed in the protocol in moles. A mole is a unit of “amount”, such as the number of molecules of DNA; there are 6.022 × 10^23 items in a mole. We can’t count the molecules of DNA in a test tube directly. But we can weigh the DNA indirectly, by measuring the fluorescence of a DNA-binding dye in the sample using a device called a Qubit. Then, if we know the number of nucleotides in the strand, we can convert the weight of the DNA into a number of moles. Refer to the course book for an in-depth review of the math.
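As a rough illustration of the weight-to-moles idea, the sketch below converts a mass of DNA into moles and then into a count of molecules. The 650 g/mole/bp average weight is the value used later in this walkthrough; the 100 ng mass and the variable names are made up for the example.

```python
# Hypothetical example: how many molecules are in 100 ng of a 280 bp template?
AVOGADRO = 6.022e23        # items per mole

mass_ng = 100              # ng, an assumed example mass
length_bp = 280            # bp
weight_per_bp = 650        # g/mole/bp, average for double-stranded DNA

grams = mass_ng * 1e-9                         # convert ng to g
grams_per_mole = length_bp * weight_per_bp     # weight of one mole of template
moles = grams / grams_per_mole                 # amount of template in moles
molecules = moles * AVOGADRO                   # amount as a count of molecules

print(f'{mass_ng} ng is about {moles / 1e-15:0.0f} fmoles')
print(f'That is roughly {molecules:0.2e} molecules')
```

Each step is kept on its own line with its unit in a comment so the conversion can be checked by hand.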

Doing this calculation manually is tedious and prone to error, which makes it the perfect thing to automate.

Walkthrough#

We do this through a series of expressions. Remember, the computer is not ‘space limited’, so we should write code that WE understand rather than trying to make everything as short and compact as possible.

Assume you have 25 ul of a 280 bp double-stranded template that you measured to be at a concentration of 50.6 ng/ul.

# It is often useful to define all of your variables at the beginning.
amplicon_length = 280 # bp
dna_weight = 650 # g/mole/bp
dna_conc = 50.6 # ng/ul
volume = 25 # ul

What is the template weight?#

template_weight = amplicon_length*dna_weight
print(f'The template weighs {template_weight} g/mole')
The template weighs 182000 g/mole

Q1: Calculate the molarity of the sample#

# Answer in fmoles/ul

dna_molarity = dna_conc * 1E-9 / template_weight / 1E-15 # SOLUTION
print(f'The DNA molarity is {dna_molarity} fmoles/ul')
The DNA molarity is 278.02197802197804 fmoles/ul
print('Is dna_molarity a float:', isinstance(dna_molarity, float))
Is dna_molarity a float: True
print(f'dna_molarity = {dna_molarity:0.1f}')
dna_molarity = 278.0

Some things to notice above:

  1. There’s an f immediately before the '. This makes it a “formatted” string. Or f-string.

  2. The text changes between several different colors.

f-strings#

These are a new (circa 2016, Python 3.6) addition to Python that makes it simple to insert data into strings. Representing our results as dynamically changing explanatory statements helps make our analysis more transparent and reproducible. f-strings make this much easier.

Take a look at this post from The Python Guru for an in-depth explanation of the formatting.
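As a quick sketch of the most common cases, the block below formats the molarity from Q1 in a few different ways. The text after the colon inside the braces is the format specification; the particular specifications shown here are just a sample of what is available.

```python
dna_molarity = 278.02197802197804  # fmoles/ul, the value calculated in Q1

# No format specification: every stored digit is shown
print(f'{dna_molarity}')

# .1f : fixed-point with one digit after the decimal point
print(f'{dna_molarity:.1f}')

# .0f : fixed-point rounded to a whole number
print(f'{dna_molarity:.0f}')

# .2e : scientific notation with two digits after the decimal point
print(f'{dna_molarity:.2e}')

# Expressions are evaluated inside the braces before formatting;
# the ',' adds a thousands separator
print(f'{dna_molarity * 25:,.0f}')
```

Formatting only changes how the number is displayed. The variable itself keeps its full precision.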

Linting through color#

If we look around our notebook, we can see that there are a lot of different text colors. Those are hints at what Python thinks we’re trying to tell it. Understanding the colors can really help with debugging.

Numbers are green.

1231231

Variables are black.

val = 1231231
other = val

Strings are orange.

val = '1231231'

Even if they are strings of numbers.

f-strings are orange.

val = f'1231231'

f-strings are orange, unless it is between { }.

age = 12
val = f'This book is {age} years old.'

The parts between curly braces are replaced by the value in the code.

Notice how imbalanced braces alter the color.

age = 12
val = f'This book is {age years old.'

Q2: Calculate the amount of sample to add.#

The protocol requires us to start with 200 fmoles of template DNA. How many microliters of our stock do we need to start with?

# Answer in ul

wanted_dna = 200 # fmoles

volume_to_add = wanted_dna / dna_molarity # SOLUTION

print(f'You should add {volume_to_add:0.2f} ul of sample to your reaction.')
You should add 0.72 ul of sample to your reaction.
print('Is volume_to_add a float:', isinstance(volume_to_add, float))
Is volume_to_add a float: True
print(f'volume_to_add = {volume_to_add:0.2f}')
volume_to_add = 0.72
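A volume this small can be awkward to pipette accurately. One possible strategy, sketched below, is to dilute the stock first and add a larger volume of the dilution. The 10-fold dilution factor and the 20 ul dilution volume are assumptions chosen for the example; they do not come from the protocol.

```python
stock_molarity = 278.0    # fmoles/ul, rounded value from Q1
wanted_dna = 200          # fmoles
dilution_factor = 10      # assumed: 1 part stock in 10 parts total

# Diluting lowers the molarity by the dilution factor
diluted_molarity = stock_molarity / dilution_factor

# The same division as before, now against the diluted sample
volume_to_add = wanted_dna / diluted_molarity

# How to prepare the dilution itself
dilution_volume = 20                              # ul, assumed total volume
stock_needed = dilution_volume / dilution_factor  # ul of stock
water_needed = dilution_volume - stock_needed     # ul of water

print(f'Mix {stock_needed:0.1f} ul of stock with {water_needed:0.1f} ul of water.')
print(f'Then add {volume_to_add:0.2f} ul of the dilution to your reaction.')
```

Because every quantity is a named variable, changing the dilution factor only requires editing one line.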

Q3: Describing the reaction yield#

The total amount of DNA we created during the PCR is called the yield of the reaction.

Create an f-string that renders the yield in femtomoles of this reaction. Round your answer to the nearest integer.

# Calculate the amount of DNA in the entire reaction
# Answer in fmoles
dna_yield = dna_molarity*volume # SOLUTION

# Create an f-string that uses the dna_yield variable
# and describes the result in a short sentence
dna_yield_description = f'The experiment yielded {dna_yield:0.0f} fmoles of DNA.' # SOLUTION

print(dna_yield_description)
The experiment yielded 6951 fmoles of DNA.
print('Is dna_yield_description a str:', isinstance(dna_yield_description, str))
Is dna_yield_description a str: True
print('Is the correct number in the description:', '6951' in dna_yield_description)
Is the correct number in the description: True

Functions#

Functions are self-contained blocks of code created for a reusable purpose.

Purpose:

  • Modularity: Breaks down complex processes into smaller, manageable parts.

  • Reusability: Allows the same code to be used multiple times without repetition.

  • Organization: Makes the code more organized and easier to understand.

def function_name(arg1, arg2, kwarg1=1, kwarg2='a'):
    "A brief function description"

    # do something with inputs
    result = arg1 + 2*arg2

    return result
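A minimal sketch of how a function like this template might be called. The template is repeated so the block runs on its own; the argument values are arbitrary.

```python
def function_name(arg1, arg2, kwarg1=1, kwarg2='a'):
    "A brief function description"

    # do something with inputs
    result = arg1 + 2*arg2

    return result

# Positional arguments are matched in order: arg1=1, arg2=2
by_position = function_name(1, 2)

# Arguments can also be given by name, in any order
by_name = function_name(arg2=2, arg1=1)

# Keyword arguments keep their default unless they are overridden.
# This template never uses kwarg1, so the result does not change.
with_override = function_name(1, 2, kwarg1=5)

print(by_position, by_name, with_override)
```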

Instead of continually copy-paste-and-change, we should write a function.

We’ve been using something like this to calculate the molarity from the concentration.

dna_molarity = dna_conc * 1E-9 / template_weight / 1E-15 

def calc_molarity(sample_concentration, sample_length, base_weight=650):
    """Calculate molarity of samples.

    sample_concentration : ng/ul
    sample_length : bases
    base_weight : g/mole/bp

    returns molarity fmols/ul
    """

    nano = 1E-9    # converts ng to g
    femto = 1E-15  # converts moles to fmoles

    amplicon_weight = sample_length*base_weight
    molarity = sample_concentration * nano / amplicon_weight / femto

    return molarity

Once created, we can use this function anywhere.

paragon_molarity = calc_molarity(50.6, 280)
print(f'Function calculated paragon molarity {paragon_molarity:0.1f} fmols/ul')
Function calculated paragon molarity 278.0 fmols/ul

Now, if we had another sample with a different concentration.

new_concentration = 150.6 # ng/ul

new_paragon_molarity = calc_molarity(new_concentration, 280)
print(f'Function calculated new molarity {new_paragon_molarity:0.1f} fmols/ul')
Function calculated new molarity 827.5 fmols/ul

Or, if for some reason you were making RNA, the base_weight would be different.

rna_paragon_molarity = calc_molarity(new_concentration, 280, base_weight=320)
print(f'Function calculated rna molarity {rna_paragon_molarity:0.1f} fmols/ul')
Function calculated rna molarity 1680.8 fmols/ul
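Functions can also call other functions. As a sketch, the hypothetical helper below (calc_volume_to_add is not part of the course code) reuses the molarity calculation to repeat the Q2 answer for any sample. A compact copy of calc_molarity is included so the block runs on its own.

```python
def calc_molarity(sample_concentration, sample_length, base_weight=650):
    """Return molarity in fmols/ul (same calculation as above)."""
    amplicon_weight = sample_length * base_weight
    return sample_concentration * 1E-9 / amplicon_weight / 1E-15


def calc_volume_to_add(wanted_fmols, sample_concentration, sample_length, base_weight=650):
    """Hypothetical helper: ul of sample that contains wanted_fmols of template."""
    molarity = calc_molarity(sample_concentration, sample_length, base_weight=base_weight)
    return wanted_fmols / molarity


paragon_volume = calc_volume_to_add(200, 50.6, 280)
print(f'Add {paragon_volume:0.2f} ul of sample to reach 200 fmols')
```

Passing base_weight through by name keeps the default in one place and still lets a caller override it.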

Q4: Write a function which calculates the reaction yield#

Use the function above as a template to create one that further calculates the reaction yield in fmols.

def calc_yield(sample_concentration, sample_length, sample_volume, base_weight=650):
    """Calculate the yield of a sample.

    sample_concentration : ng/ul
    sample_length : bases
    sample_volume : ul
    base_weight : g/mole/bp

    returns sample_yield in fmols
    """
    # BEGIN SOLUTION NO PROMPT

    molarity = calc_molarity(sample_concentration, sample_length, base_weight=base_weight)
    sample_yield = molarity*sample_volume

    return sample_yield

    # END SOLUTION
current_yield = calc_yield(50.6, 280, 25)
print(f'Current reaction yield is {current_yield:0.1f} fmols')
Current reaction yield is 6950.5 fmols
print(f'Testing calc_yield(50.6, 280, 25) = {calc_yield(50.6, 280, 25):0.1f}')
Testing calc_yield(50.6, 280, 25) = 6950.5
print(f'Testing calc_yield(35, 77, 19, base_weight=320) = {calc_yield(35, 77, 19, base_weight=320):0.1f}')
Testing calc_yield(35, 77, 19, base_weight=320) = 26988.6
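Printing rounded values is one way to check a function by eye. As an alternative sketch, math.isclose from the standard library compares a result to an expected value within a tolerance. The 0.1 fmol tolerance here is an arbitrary choice for the example, and a compact copy of calc_yield is included so the block runs on its own.

```python
import math


def calc_yield(sample_concentration, sample_length, sample_volume, base_weight=650):
    """Return the yield in fmols (same calculation as the solution above)."""
    molarity = sample_concentration * 1E-9 / (sample_length * base_weight) / 1E-15
    return molarity * sample_volume


result = calc_yield(50.6, 280, 25)

# Floats rarely match a hand-typed value exactly
exact_match = result == 6950.5

# isclose accepts any value within the given absolute tolerance
close_match = math.isclose(result, 6950.5, abs_tol=0.1)

print(f'Exact match: {exact_match}')
print(f'Close match: {close_match}')
```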

Conclusion#

In this walkthrough we have discussed a number of ways to perform basic math in Python. We also covered strategies to modularize processes into reusable functions. This week we worked with a ‘one number at a time’ strategy. In the next module we will explore using tables to work with multiple samples at the same time.