AerE 361 - Professor Nelson

Lecture 4 - Computers and Operating Systems - Part 2


Operating Systems

Computers only work because of the software running on them, and the first bit of software that the computer will talk to will be...not the OS. The CPU needs a way to get "booted up". Historically this was the job of the BIOS, which has largely been replaced by UEFI. This firmware gets the CPU started (bootstrapped) and most peripherals up and running.

But after that, we have the OS.

  • It is the translator between the user and the hardware
  • It acts as a manager for the hardware
    • Memory
    • Disk
    • CPU
    • Peripherals!

OS and the user

A key function of the OS is that it talks to the user (it's a people person!). The OS takes input from the user, sends it to the CPU or other peripherals as needed, and then displays any results. Today, Graphical User Interfaces (GUIs) are the norm. But the command line used to be the norm and, as we discussed before, it still is for HPCs.


An OS has many tasks and a few levels that it works with.


Linux on the desktop?

Linux is not seen on the desktop very often (but it is out there). But if you count macOS, which is based on BSD (which in turn is based on Unix), then in a way a Unix-like OS is on the desktop as well.


But wait, there's Moore

If we look at just desktop machines, Windows has a clear lead. But many of you are using Linux (or a close cousin) without realizing it. Virtually all mobile devices run either Linux (in the case of Android) or a close cousin, BSD (in the case of iOS). If you take that into consideration, Linux has a clear lead, but again, most people don't even know they are using it.


The decline of Windows?

Windows usage has gone down. The decline is slow, but it is there. There are several reasons for this. The biggest is that people are changing how they use computers. Mobile computing has been changing people's perspective on what a computer is, and more and more people now use mobile devices for a lot of their day-to-day activities.


Where Linux is King

As we have discussed before, Linux is king when it comes to HPCs and supercomputers. The graph below pretty much explains itself.

No description has been provided for this image

The Linux Family Tree

Linux is a Unix-like operating system and as such owes much of how it operates to the *nix family of operating systems. Unix itself has an interesting history as well.


Unix History

  • Developed at Bell Labs in the 1970s
  • Over its development it had to adapt to multiple platforms
  • Originally it was written in Assembly but then moved to C to aid in the development
  • POSIX was created to standardize the OS
  • Many derivatives were developed from Unix

Linux Kernel

  • The kernel sits at the core of all Unix-like operating systems
  • It is the first program loaded into memory after booting and oversees all low-level communications
  • It also handles memory management, process management, scheduling, file management, input/output and network management

Handling a program

Each program may have one or more processes that will run. Linux is well suited for handling this and uses a PID (Process ID number) to track each one. Init is the very first process that starts up and so it always has a PID of 1.
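To make the idea of a PID concrete, here is a minimal sketch that asks the operating system for the PID of the running Python process and of its parent. It uses only the standard `os` module; the variable names are chosen just for this illustration.

```python
import os

# PID of the Python process running this code
pid = os.getpid()

# PID of the process that launched it (for example, a shell or the Jupyter server)
parent_pid = os.getppid()

print(f"This process has PID {pid}; its parent has PID {parent_pid}")
```

The numbers will differ every time you start a new process, since the OS hands out PIDs as processes are created.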


Linux Shells

A GUI is certainly possible on Linux, and many Linux distros designed for desktop use do use one. But for much HPC and supercomputer work, a shell or Command Line Interface (CLI) is used more often. Most of these systems use Bash (the Bourne-again shell), and that is what you will use in this class.


Danger zone

Like anything, be careful! Unix commands often do not have very many checks or “are you sure” prompts. With Codespaces, the worst you will do is delete your own work. The same goes for the HPC (you don't have access to do any other damage). But on your own machine, or if you do happen to have root on a machine...


A few notes on the HPC

As I mentioned, we are going to switch over to using the HPC for our homework. We have a few things to set up first, though. Let's walk through starting a job on the HPC and cloning our ICA to it.

Setting up our Python environment

Before we start anything, we need to set up a virtual environment for Python. A virtual environment lets us isolate what we run and configure it the way we want. We have already created this for you, but you do need to activate it.

Let's begin by activating the environment.

source work/classtmp/AerE3610/bin/activate
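If you want to confirm from Python that the activation took effect, one quick check is to compare `sys.prefix` with `sys.base_prefix`. This is a small sketch using only the standard library; the variable name `in_venv` is just for illustration.

```python
import sys

# Inside a virtual environment, sys.prefix points at the environment,
# while sys.base_prefix still points at the Python it was created from.
in_venv = sys.prefix != sys.base_prefix

print(f"Interpreter: {sys.executable}")
print(f"Running inside a virtual environment: {in_venv}")
```

If this prints False, the interpreter you are running is not the one from the activated environment.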

Now that we have our environment, we need to add items to it. Right now, Python does not have any modules installed, so let's change that. Let's start by setting up what we need for our Jupyter Notebook.

python3 -m pip install ipykernel --no-cache-dir

Now, we need to set up a config file for our Jupyter kernel:

python3 -m ipykernel install --user --name "AerE3610"

Please note that we only need to do this once. Your environment and config are stored in your home directory on the HPC. Now let's continue with a few things we can do in this environment.

Our final step is to tell Visual Studio Code where this environment is. Follow along as we do that.

Operating system commands in Python

Just about any programming language can access various aspects of the operating system. In Python, we can easily interact with the operating system using the os and platform modules. To start, let's import both os and platform. Make sure you run the code below or the rest of the code will not work.
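The import cell itself does not appear in this copy of the notes. A minimal version, assuming nothing beyond the two standard-library modules named above, would look like this:

```python
import os        # filesystem and process utilities
import platform  # information about the OS, hardware, and Python itself
```

Both modules ship with Python, so nothing needs to be installed with pip for this step.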

Let's start with something simple: have Python read back what operating system it is currently running on. Put the following code in the box below.

platform.uname()
In [3]:
platform.uname()
Out[3]:
uname_result(system='Linux', node='nova22-21', release='5.14.0-427.28.1.el9_4.x86_64', version='#1 SMP PREEMPT_DYNAMIC Fri Jul 19 14:40:47 EDT 2024', machine='x86_64')

After you run the block above, you should see something that identifies your operating system, the type of hardware, and even the CPU type. Now, let's also see what version of Python you are running. Put the following code in the block below.

platform.python_version()

As a note on this, you should be running Python 3.8 or higher. Most of you should be on 3.9 or 3.10.

In [4]:
platform.python_version()
Out[4]:
'3.10.10'

This is great, but most often what you want Python, or any other programming language, to work with is the filesystem. This can mean reading and writing files, and even creating files and folders. For now, let's find out what file path we are in. We can use the following to determine the current working directory.

os.getcwd()

Note that this will look different from computer to computer. It will depend on whether you are running Windows or a *nix OS like macOS or Linux. It will also change based on where you are running this notebook.

In [5]:
os.getcwd()
Out[5]:
'/home/mnelson/aere361_lectures'

We can also list what files and folders we have in our current directory. Type in the following in the box below.

os.listdir()
In [6]:
os.listdir()
Out[6]:
['.gitignore',
 'slides_template.tpl',
 'Lecture_3.slides.html',
 'custom.css',
 'Module 3',
 'Module 4',
 'data',
 'Lecture_1.slides.html',
 'images',
 'AerE 361 Lecture 5 Documentation.ipynb',
 'Lecture_1.ipynb',
 'Lecture_2.ipynb',
 'Lecture_3.ipynb',
 'AerE 361 Lecture 6 - LaTeX and technical writing-student.ipynb',
 'Lecture_4.ipynb',
 'Lecture_4.slides.html',
 'Module 2',
 '.git',
 'README.md',
 'Lecture_2.slides.html']

OK, let's combine some of this. We can use the following command to make a directory.

os.mkdir(path = "path location")

We want to make a new directory inside the directory we are currently in. To do that, we will pass the full path of the new directory. We could type this in manually, but that would not work in other setups where the location might change. Instead, we can get our current directory using os.getcwd(), the command we used above, and then append the name of the new directory. That is easily done by using "+" to concatenate the strings. Use os.mkdir to make a new folder called "AerE3610" in the current folder.

Hint: You will need an additional "/" before the name of the folder you are trying to create.

In [10]:
os.mkdir(path = os.getcwd() + "/AerE3610")

Let's make sure this works. Run the code block below after running the one above. You should see your folder now listed.

In [11]:
os.listdir()
Out[11]:
['.gitignore',
 'slides_template.tpl',
 'Lecture_3.slides.html',
 'custom.css',
 'Module 3',
 'Module 4',
 'data',
 'Lecture_1.slides.html',
 'images',
 'AerE 361 Lecture 5 Documentation.ipynb',
 'Lecture_1.ipynb',
 'Lecture_2.ipynb',
 'AerE3610',
 'Lecture_3.ipynb',
 'AerE 361 Lecture 6 - LaTeX and technical writing-student.ipynb',
 'Lecture_4.ipynb',
 'Lecture_4.slides.html',
 'Module 2',
 '.git',
 'README.md',
 'Lecture_2.slides.html']
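Concatenating with "+" and a hard-coded "/" works on Linux, but it raises an error if the cell is run a second time, and the separator is different on Windows. As a hedged alternative, the sketch below uses `os.path.join` and `os.makedirs(..., exist_ok=True)`. It works inside a temporary directory (an assumption made here so the example does not touch your real files).

```python
import os
import tempfile

# Work inside a throwaway directory so the example does not touch real files.
with tempfile.TemporaryDirectory() as base_dir:
    # os.path.join inserts the correct separator for the current OS.
    new_dir = os.path.join(base_dir, "AerE3610")

    # exist_ok=True avoids a FileExistsError if this runs more than once.
    os.makedirs(new_dir, exist_ok=True)
    os.makedirs(new_dir, exist_ok=True)  # second call is harmless

    created = os.path.isdir(new_dir)
    print(os.listdir(base_dir))  # prints ['AerE3610']
```

In the notebook you would swap `base_dir` for `os.getcwd()` to get the same result as the cell above.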

Bash commands

In our Jupyter Notebooks, we can also run Bash commands directly. This is done by putting an "!" before the command we want to run. Let's run the following command:

!pwd

This should return what directory you are in now.

In [12]:
!pwd
/home/mnelson/aere361_lectures
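The "!" prefix is a feature of Jupyter (IPython), not of Python itself, so it will not work in a plain .py script. A rough equivalent in ordinary Python is the standard `subprocess` module. This sketch assumes a Linux or macOS system where the `pwd` program is available.

```python
import os
import subprocess

# Run pwd as a child process and capture what it prints.
result = subprocess.run(["pwd"], capture_output=True, text=True, check=True)

# Strip the trailing newline that the command adds to its output.
shell_cwd = result.stdout.strip()
print(shell_cwd)
```

The child process inherits the working directory of the Python process, so this refers to the same directory that `os.getcwd()` reports.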

Let's create a file in the folder we just created. In Linux, it is easy to create a file using touch. Another useful tool is the man command. This is short for manual, and it lets us pull up information on most of the Linux commands we can use.

Use man to look up touch. You can do this either in this notebook or in a terminal. Then, using touch, create a file in the AerE3610 folder we just created. Let's call it "3610.txt".

In [21]:
!man touch
TOUCH(1)                         User Commands                        TOUCH(1)

NAME
       touch - change file timestamps

SYNOPSIS
       touch [OPTION]... FILE...

DESCRIPTION
       Update  the  access  and modification times of each FILE to the current
       time.

       A FILE argument that does not exist is created empty, unless -c  or  -h
       is supplied.

       A  FILE  argument  string of - is handled specially and causes touch to
       change the times of the file associated with standard output.

       Mandatory arguments to long options are  mandatory  for  short  options
       too.

       -a     change only the access time

       -c, --no-create
              do not create any files

       -d, --date=STRING
              parse STRING and use it instead of current time

       -f     (ignored)

       -h, --no-dereference
              affect each symbolic link instead of any referenced file (useful
              only on systems that can change the timestamps of a symlink)

       -m     change only the modification time

       -r, --reference=FILE
              use this file's times instead of current time

       -t STAMP
              use [[CC]YY]MMDDhhmm[.ss] instead of current time

       --time=WORD
              change the specified time: WORD is access, atime, or use: equiv‐
              alent to -a WORD is modify or mtime: equivalent to -m

       --help display this help and exit

       --version
              output version information and exit

       Note that the -d and -t options accept different time-date formats.

DATE STRING
       The  --date=STRING  is  a mostly free format human readable date string
       such as "Sun, 29 Feb 2004 16:21:42 -0800" or "2004-02-29  16:21:42"  or
       even  "next Thursday".  A date string may contain items indicating cal‐
       endar date, time of day, time zone, day of week, relative  time,  rela‐
       tive date, and numbers.  An empty string indicates the beginning of the
       day.  The date string format is more complex than is easily  documented
       here but is fully described in the info documentation.

AUTHOR
       Written  by  Paul  Rubin, Arnold Robbins, Jim Kingdon, David MacKenzie,
       and Randy Smith.

REPORTING BUGS
       GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
       Report any translation bugs to <https://translationproject.org/team/>

COPYRIGHT
       Copyright © 2020 Free Software Foundation, Inc.   License  GPLv3+:  GNU
       GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
       This  is  free  software:  you  are free to change and redistribute it.
       There is NO WARRANTY, to the extent permitted by law.

SEE ALSO
       Full documentation <https://www.gnu.org/software/coreutils/touch>
       or available locally via: info '(coreutils) touch invocation'

GNU coreutils 8.32               January 2024                         TOUCH(1)
In [22]:
!touch ./AerE3610/3610.txt
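The same thing can be done without leaving Python. The sketch below uses `pathlib.Path.touch` from the standard library to create an empty file. It runs in a temporary directory (an assumption made so the example is safe to run anywhere), and the folder and file names simply mirror the ones used above.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    target = Path(base_dir) / "AerE3610" / "3610.txt"

    # Make sure the parent folder exists before creating the file.
    target.parent.mkdir(parents=True, exist_ok=True)

    # Like the touch command: create the file if missing, else update its timestamp.
    target.touch()

    file_exists = target.is_file()
    file_size = target.stat().st_size
    print(f"{target.name}: exists={file_exists}, size={file_size} bytes")
```

A freshly touched file is empty, which is why the reported size is 0 bytes.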

HPC Testing

Since we are on the HPC, let's test a few things.

In [13]:
import multiprocessing as mp
print(f"Number of cpu: {mp.cpu_count()}")
Number of cpu: 64
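One caveat: `mp.cpu_count()` reports the number of CPUs on the whole node, which is not necessarily the number your job was given. On Linux, `os.sched_getaffinity(0)` returns the set of CPUs the current process is allowed to run on. Whether that matches your allocation depends on how the cluster's scheduler is configured, so treat this as a sketch rather than a guarantee.

```python
import multiprocessing as mp
import os

total_cpus = mp.cpu_count()

# os.sched_getaffinity exists only on some platforms (Linux among them),
# so fall back to the total count elsewhere.
if hasattr(os, "sched_getaffinity"):
    usable_cpus = len(os.sched_getaffinity(0))
else:
    usable_cpus = total_cpus

print(f"CPUs on the node: {total_cpus}")
print(f"CPUs this process may use: {usable_cpus}")
```

If the second number is smaller, sizing a pool with it instead of `mp.cpu_count()` avoids starting more workers than you have cores for.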
In [14]:
import numpy as np
import time

def random_square(seed):
    np.random.seed(seed)
    random_num = np.random.randint(0, 10)
    return random_num**2
In [15]:
t0 = time.time()
results = []
for i in range(1000000): 
    results.append(random_square(i))
t1 = time.time()
print(f'Execution time {t1 - t0} s')
Execution time 5.402559280395508 s
In [16]:
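t0 = time.time()
n_cpu = mp.cpu_count()

# The with-block shuts the worker processes down once the work is done
with mp.Pool(processes=n_cpu) as pool:
    # pool.map already returns a list, so no extra brackets are needed
    results = pool.map(random_square, range(1000000))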
t1 = time.time()
print(f'Execution time {t1 - t0} s')
Execution time 2.939646005630493 s
In [17]:
import matplotlib.pyplot as plt
%matplotlib inline

def serial(n):
    t0 = time.time()
    results = []
    for i in range(n): 
        results.append(random_square(i))
    t1 = time.time()
    exec_time = t1 - t0
    return exec_time

def parallel(n):
    t0 = time.time()
    n_cpu = mp.cpu_count()

    # Close the pool when done so worker processes do not pile up across runs
    with mp.Pool(processes=n_cpu) as pool:
        results = pool.map(random_square, range(n))
    t1 = time.time()
    exec_time = t1 - t0
    return exec_time
In [18]:
n_run = np.logspace(1, 7, num = 7)

t_serial = [serial(int(n)) for n in n_run]
t_parallel = [parallel(int(n)) for n in n_run]
In [19]:
plt.figure(figsize = (10, 6))
plt.plot(n_run, t_serial, '-o', label = 'serial')
plt.plot(n_run, t_parallel, '-o', label = 'parallel')
plt.loglog()
plt.legend()
plt.ylabel('Execution time (s)')
plt.xlabel('Number of random points')
plt.show()
In [20]:
from traitlets.config.manager import BaseJSONConfigManager
from pathlib import Path
path = Path.home() / ".jupyter" / "nbconfig"
cm = BaseJSONConfigManager(config_dir=str(path))
tmp = cm.update(
        "rise",
        {
            "theme": "moon",
            "transition": "fade",
            "start_slideshow_at": "selected",
            "autolaunch": True,
            "width": "100%",
            "height": "100%",
            "header": "",
            "footer":"",
            "scroll": True,
            "enable_chalkboard": True,
            "slideNumber": True,
            "center": False,
            "controlsLayout": "edges",
            "hash": True,
            "custom_css": "custom.css"
        }
    )