Lecture Zero: Introduction
- 50% homework (5 assignments at 10% each, released on GitHub and submitted on Gradescope)
- 40% final project
- 10% participation (show up to class or let course staff know you can’t make it)
Class is curved.
Post questions on Piazza (for conceptual questions about the homeworks or lecture clarifications, not for debugging) or come to OH:
- Peyton/Davis: Wednesday 4pm-6pm EST
- Armaan/Campbell: Sunday 4pm-6pm EST
What is DevOps?
DevOps means breaking down the wall between developers (people writing code) and operations (people releasing and deploying code into production and making sure it is reliable). Traditionally these have been two very separate teams, which means that the incentives of developers and operations engineers don’t always align. Developers aren’t motivated to make life easier for operations, and operations isn’t motivated to make life easier for developers. When a crash happens in production, the people handling the crash aren’t the ones familiar with the code.
The key concept behind DevOps is that if these two teams share responsibilities, they can build empathy and align their incentives, which ultimately leads to a better experience for the end user because new features are more stable and reliable.
There are a few main DevOps solutions we will be focusing on in CIS 188:
- Automated testing and deployment (so we can easily ship well-tested new features)
- Easy deploy rollback (if something breaks we can revert quickly)
- Observability (so we can know when something is wrong)
The main takeaway is to get developers involved in the operations process so that developers can use their skills to build tools to automate away the tedious parts of operations jobs. DevOps is not a role, but a way of doing things.
We’ll be using Python for most of the development side of the DevOps solutions we cover in this course. It’s common and well supported in the infrastructure space because it’s easy to learn and there is wide library support.
    # Comments start with a `#`
    import time  # import a module from the standard library

    for i in range(1, 16):  # For loops iterate over iterables like lists and `range()`
        print(i)
        if i % 3 == 0 and i % 5 == 0:  # Conditional expressions
            print(time.time())
            print("fizzbuzz")  # Strings can be double or single quoted
        elif i % 3 == 0:
            print("fizz")
        elif i % 5 == 0:
            print("buzz")
Running the above code produces the following output:
    $ python3 test.py
    1
    2
    3
    fizz
    4
    5
    buzz
    6
    fizz
    7
    8
    9
    fizz
    10
    buzz
    11
    12
    fizz
    13
    14
    15
    1607229328.9530184
    fizzbuzz
Code in Python can run at the top-level, but it’s good practice to pull logic into functions:
    def get_buzz(i):  # (def)ine a function
        if i % 3 == 0 and i % 5 == 0:
            return "fizzbuzz"
        elif i % 3 == 0:
            return "fizz"
        elif i % 5 == 0:
            return "buzz"
        return ""

    for i in range(1, 16):
        print(str(i) + " " + get_buzz(i))  # Call a function
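One practical benefit of pulling logic into a function is that it becomes easy to check automatically, which ties back to the automated testing theme mentioned earlier. The following is only a minimal sketch using plain `assert` statements; the helper name `test_get_buzz` is hypothetical and not part of the course materials.

```python
def get_buzz(i):
    """Return the fizzbuzz label for i, or an empty string if none applies."""
    if i % 3 == 0 and i % 5 == 0:
        return "fizzbuzz"
    elif i % 3 == 0:
        return "fizz"
    elif i % 5 == 0:
        return "buzz"
    return ""


def test_get_buzz():
    # Hypothetical test helper: each assert raises AssertionError
    # if get_buzz stops returning the expected label.
    assert get_buzz(3) == "fizz"
    assert get_buzz(5) == "buzz"
    assert get_buzz(15) == "fizzbuzz"
    assert get_buzz(7) == ""


if __name__ == "__main__":
    test_get_buzz()
    print("all checks passed")
```

Because the checks call the function directly rather than reading printed output, they keep working even if the surrounding loop or formatting changes.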
You can check out CIS 192 for more learning materials, and come to office hours with any questions about Python!
Writing code is useful; reusing code is even more useful! Making sure that you have access to the right packages, and that you are using the correct version of each package, is no easy task. Python’s default package manager,
Most importantly, Poetry helps us create reproducible build environments wherever we run our code: on our local machines, on our friends’ machines, or even on a production server somewhere in “the cloud.”
How it works
Poetry creates and manages two files.
pyproject.toml is Poetry’s dependency file: a human-readable (and writable) file that declares the “acceptable versions” of each package. This is generally a range, such as “1.1 - 1.12”, which is useful if, say, version 2.0 contains a breaking change.
poetry.lock is the lock file: an autogenerated file used to declare specific package versions, including dependencies of dependencies. Poetry uses the lock file to save and persist its resolution of conflicts that it resolves from the list in
One recurring design pattern we see in DevOps is the package manager. This is a tool that helps manage your program’s dependencies. In other words, the package manager is in charge of keeping track of what packages your project needs to run correctly, and then downloading those packages in a way that makes it easy for your program to use this auxiliary code.
We’ll look at a few different package managers over the course of the semester. Node has one called NPM (Node Package Manager), Java has a package manager called Maven, and Python has a few offerings. Note that these package managers are all a little different because they work with different languages that all have different nuances. This is why we can’t reuse package managers across languages.
The Python package manager we’ll be using is called Poetry. Essentially, Poetry lets you download the Python libraries you need, then creates a virtual Python environment on your machine in which to run your code with those libraries. So, why the virtual environment? The answer is that Python varies a lot from version to version (especially Python 2 compared to Python 3). The virtual environment ensures that you, your team of developers, and your production environment are all on the same version of Python. This way we can avoid the bugs that arise when code written for one version of Python is actually run with a different version.
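To make the idea of a virtual environment a little more concrete, here is a small sketch that reports which Python version is running and whether the interpreter appears to be inside a virtual environment. It relies only on the standard library; the function names `in_virtualenv` and `python_version` are hypothetical, and the check assumes an environment created by `venv` or a recent release of `virtualenv`.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter appears to run inside a virtual environment."""
    # In a venv-style environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def python_version():
    """Return the running interpreter's version as a 'major.minor.micro' string."""
    info = sys.version_info
    return f"{info.major}.{info.minor}.{info.micro}"


if __name__ == "__main__":
    print("Python version:", python_version())
    print("Inside a virtual environment:", in_virtualenv())
```

Running a script like this both inside and outside the environment is a quick way to see which interpreter your code is really using.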
Now, let’s get into how to actually use Poetry. First, make sure that you have Poetry installed on your machine; installation instructions can be found in the official Poetry documentation.
Once you have Poetry installed, let’s create a new project:
    # Create a new folder called poetry_demo
    $ mkdir poetry_demo
    # Enter the new folder
    $ cd poetry_demo
    $ poetry init
Poetry will now offer many options for initializing your project. Just hit enter for all of them (Poetry will use the default setup, which is fine for our purposes). Once you’ve finished, you’ll see a new file, pyproject.toml, in the directory; this file stores the information we just initialized.
Next, let’s add a dependency:
    $ poetry add numpy
    Creating virtualenv poetry-demo-KkU142w6-py3.9 in /Users/airbenderang/Library/Caches/pypoetry/virtualenvs
    Using version ^1.19.5 for numpy
    Updating dependencies
    Resolving dependencies... (39.8s)
    Writing lock file
    Package operations: 1 install, 0 updates, 0 removals
      • Installing numpy (1.19.5)
Poetry actually does two things here. It downloads NumPy, but before that it creates a virtual environment, which we are going to use to run our Python code. If we wanted to use a deprecated version of Python (like Python 2), we could configure Poetry to set up the virtual environment so it runs an older release of Python. Again, you will see a new file in your directory: the poetry.lock file. It isn’t meant to be read or edited by hand, but it tracks which packages your program depends on and the exact version of each of those packages.
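The output above shows both kinds of version information: a range (`^1.19.5`) and the exact version that was installed (`1.19.5`). As a rough illustration of what a caret range accepts, the sketch below checks a version against `^base`. It is a simplification, not Poetry’s own resolver: the function names are hypothetical, it assumes plain `major.minor.patch` version strings, and it only handles major versions of 1 or higher (major version 0 follows a stricter rule).

```python
def parse_version(text):
    """Turn a 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_caret(version, base):
    """Check version against ^base, assuming base has a major version >= 1.

    Simplified reading of the caret rule: at least `base`, and below the
    next major release.
    """
    candidate = parse_version(version)
    lower = parse_version(base)
    if lower[0] < 1:
        raise ValueError("this sketch only handles major versions >= 1")
    upper = (lower[0] + 1, 0, 0)
    # Tuples compare element by element, which matches version ordering here
    return lower <= candidate < upper


if __name__ == "__main__":
    for v in ["1.19.4", "1.19.5", "1.21.0", "2.0.0"]:
        print(v, satisfies_caret(v, "1.19.5"))
```

Under these assumptions, `1.19.5` and `1.21.0` fall inside the range while `1.19.4` and `2.0.0` do not. The lock file then records the single version that was actually chosen from that range.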
Finally, let’s run some code in Poetry’s virtual environment. There are two ways to run Python programs with Poetry. The first is to type poetry run python script.py, which runs a Python script in the Poetry environment. Instead, we will open a new shell that has the Poetry virtual environment as its default Python environment:
    $ poetry shell
    Spawning shell within /Users/airbenderang/Library/Caches/pypoetry/virtualenvs/poetry-demo-KkU142w6-py3.9
    $ . /Users/airbenderang/Library/Caches/pypoetry/virtualenvs/poetry-demo-KkU142w6-py3.9/bin/activate
    # Open a new Python interactive terminal
    $ python
    Python 3.9.1 (default, Dec 24 2020, 16:53:18)
    [Clang 12.0.0 (clang-1188.8.131.52)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy as np
    >>> x = np.array([[1,2],[3,4]])
    >>> x
    array([[1, 2],
           [3, 4]])
    >>> y = np.linalg.inv(x)
    >>> y
    array([[-2. ,  1. ],
           [ 1.5, -0.5]])
    >>> exit() # To leave the Python terminal
    # Then exit again to leave the Poetry shell
    $ exit
It looks like NumPy works! This means that Poetry is managing our dependencies properly, so they are accessible when we run our Python code with Poetry. Now, let’s make a simple Python file and have Poetry run it. Create a new file called average.py in the same directory as your poetry.lock and paste this code into it:
    import sys

    import numpy as np

    # sys.argv[0] is the script name, so real arguments start at index 1
    if len(sys.argv) < 2:
        print("Not enough command line arguments")
        sys.exit(1)

    xs = []
    try:
        for arg in sys.argv[1:]:
            xs.append(int(arg))
    except ValueError:  # int() raises ValueError for non-integer text
        print("Command line arguments are not integers")
        sys.exit(1)

    print(np.average(np.asarray(xs)))
Now, we can run this in the virtual environment created by Poetry:
    $ poetry run python average.py 1 2 3 4
    2.5
Awesome, it looks like this is working, too. Try changing around the command line arguments!
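For comparison, the same idea can be sketched using only the standard library, with `argparse` doing the argument checking. This is an alternative, not the course’s version: it swaps NumPy’s `average` for `statistics.mean`, and the function names `parse_numbers`, `average`, and `main` are hypothetical.

```python
import argparse
import statistics


def parse_numbers(argv):
    """Parse a list of command line strings into a list of ints."""
    parser = argparse.ArgumentParser(description="Average some integers.")
    # nargs="+" requires at least one value; type=int rejects non-integers
    parser.add_argument("numbers", nargs="+", type=int, help="integers to average")
    return parser.parse_args(argv).numbers


def average(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return statistics.mean(numbers)


def main(argv=None):
    # When argv is None, argparse reads the arguments from sys.argv[1:]
    print(average(parse_numbers(argv)))


if __name__ == "__main__":
    # Demo with fixed arguments; call main() with no arguments
    # to read real command line arguments instead.
    main(["1", "2", "3", "4"])
```

With this approach, `argparse` produces the usage and error messages itself, so the manual length check and `try`/`except` are no longer needed.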