
Foundations in Data Science and Machine Learning¶

Module 2: Basic Python¶

Malka Guillot¶


Executing notebooks in vscode¶

1. List available kernels¶

  • As with running scripts, first select the interpreter (the kernel), then run the notebook

2. Select the kernel¶


3. Run the notebook¶


Assignment and scalar types¶

  • Representing numbers: integers and floats
  • Using Python like a calculator
  • Comparing variables
  • Representing True and False: Booleans

Integer¶

  • Variables are assigned with a single = sign
In [2]:
a = 3
  • Types are inferred, not declared upfront
  • Types can be inspected with type()
In [3]:
type(a)
Out[3]:
int
In [4]:
type(3)
Out[4]:
int
  • You can re-assign variables with different values
  • Ints can hold arbitrarily large numbers
In [5]:
a = 5
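A minimal sketch of the two points above. The name `big` is introduced here only for illustration, and `**` is the exponentiation operator covered further down.

```python
# Re-assignment: the same name can later point to a new value, even of another type
a = 5
a = "five"
print(type(a))    # <class 'str'>

# Ints grow as needed; there is no fixed upper limit
big = 2**100
print(big)        # 1267650600228229401496703205376
print(type(big))  # <class 'int'>
```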

Float¶

  • Floats represent real numbers
In [6]:
b = 3.141
type(b)
Out[6]:
float
  • They are imperfect representations
    • Limited precision (roughly 15 to 17 significant decimal digits)
    • Can hold values between roughly $-1.8 \times 10^{308}$ and $1.8 \times 10^{308}$
In [7]:
c = 0.1+0.2
c
Out[7]:
0.30000000000000004
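Because of this rounding, testing floats for exact equality can surprise you. One common workaround, sketched here with the standard library function `math.isclose`, is to compare within a tolerance:

```python
import math

c = 0.1 + 0.2

# Exact comparison fails because c is stored as 0.30000000000000004
print(c == 0.3)              # False

# Comparing within a small relative tolerance behaves as intended
print(math.isclose(c, 0.3))  # True
```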

Python as a calculator¶

  • Arithmetic works as you would expect
  • Brackets work as expected
  • Mixing ints and floats converts everything to floats
In [8]:
a = 3
b = 3.1415
print(b / a)
print((a + b) * 3)
1.0471666666666668
18.424500000000002

Some things you need to know¶

  • ** is exponentiation (not ^ )
  • // is floor division (the quotient rounded down to a whole number)
  • % yields the remainder of a division
In [9]:
a**b
Out[9]:
31.54106995953402
In [10]:
b//a
Out[10]:
1.0
In [11]:
b%a
Out[11]:
0.14150000000000018
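The same operators are easier to follow with small integers. The values below are chosen only for illustration:

```python
print(7 // 2)    # 3    floor division rounds the quotient down
print(7 % 2)     # 1    remainder of 7 divided by 2
print(-7 // 2)   # -4   rounds down, not towards zero
print(2 ** 10)   # 1024 exponentiation
print(2 ^ 10)    # 8    ^ is bitwise XOR, not a power

# Quotient and remainder always fit together again
print((7 // 2) * 2 + 7 % 2)  # 7
```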

Comparisons¶

  • Comparison operators are == , != , < , > , <= , >=
  • Remember: = is used for assignment, not comparison
  • The result of a comparison is a Boolean
In [12]:
a = 3
b = 3 
print(a == b)
print(a < b)
print(a >= b)
True
False
True
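A short sketch with two different values, including the "not equal" operator `!=`. The name `is_smaller` is made up for this example:

```python
a = 3
b = 4

print(a != b)   # True
print(a <= b)   # True

# The result of a comparison can be stored like any other value
is_smaller = a < b
print(type(is_smaller))  # <class 'bool'>

# An int and a float with the same numeric value compare as equal
print(3 == 3.0)  # True
```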

Booleans¶

  • Booleans can be True or False (case sensitive)
In [13]:
a = True
b = False
type(a)
Out[13]:
bool
  • and , or and not can be used to express complex conditions
In [14]:
a and b
Out[14]:
False
In [15]:
a or b 
Out[15]:
True
In [16]:
not b 
Out[16]:
True
  • Fundamental for control flow, which we will see later
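In practice, Booleans usually come from comparisons and are then combined. A minimal sketch with made-up variables (`temperature`, `is_raining`):

```python
temperature = 22
is_raining = False

# Combine a comparison with a negated Boolean
nice_weather = temperature > 18 and not is_raining
print(nice_weather)  # True

# `or` is True as soon as one side is True
extreme = temperature < 0 or temperature > 35
print(extreme)       # False
```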

Strings¶

Assigning strings¶

  • Strings can hold arbitrary text data
In [17]:
a = "Hello"
type(a)
Out[17]:
str
  • Defined with single or double quotes
In [18]:
b = 'embed "double" quotes'
c = "embed 'single' quotes"
  • Strings containing numbers do not behave like numbers!
In [19]:
not_an_int = "123"
type(not_an_int)
Out[19]:
str
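A small sketch of what "do not behave like numbers" means: `+` and `*` concatenate and repeat strings, and `int()` is needed to get an actual number. The name `as_int` is illustrative:

```python
not_an_int = "123"

print(not_an_int + "1")  # 1231    concatenation, not addition
print(not_an_int * 2)    # 123123  repetition, not multiplication

# Convert explicitly to do arithmetic
as_int = int(not_an_int)
print(as_int + 1)        # 124
```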

Everything is an object $\rightarrow$ Everything has methods¶

  • Most programming languages have int , float , bool and string types
  • C, Fortran, …:
    • low level types to store data efficiently and do fast calculations
  • Python: Everything is an object
    • Objects with convenient methods
    • Trade efficiency for convenience
    • We can still get efficiency when needed!

Some string methods¶

In [20]:
a = "Hello World!"
a.lower()
Out[20]:
'hello world!'
In [21]:
a.replace("!",".")
Out[21]:
'Hello World.'
In [22]:
a.startswith("Hello")
Out[22]:
True
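Note that these methods return a new string and leave the original untouched, so they can be chained. A brief sketch (the name `shout` is made up):

```python
a = "Hello World!"

shout = a.upper()
print(shout)  # HELLO WORLD!
print(a)      # Hello World!   the original is unchanged

# Each method returns a string, so calls can be chained
print(a.lower().replace("world", "there"))  # hello there!

# split() without arguments cuts at whitespace and returns a list
print(a.split())  # ['Hello', 'World!']
```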

Strings are a sequence type¶

  • Most of the time, you can think of strings as scalar variables
  • They are actually sequences of characters
    • Have a length
    • Can be indexed
    • Can be sliced
    • Can be iterated over
  • Indexing starts at 0
  • Negative indices start from the end
In [23]:
a = "Hello World!"
len(a)
Out[23]:
12
In [24]:
a[0]
Out[24]:
'H'
In [25]:
a[-1]
Out[25]:
'!'
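Slicing and iterating work on the same string. A minimal sketch, where the collecting list `first_three` is added only so the result is easy to inspect:

```python
a = "Hello World!"

print(a[0:5])   # Hello    characters 0 to 4
print(a[6:])    # World!   from index 6 to the end
print(a[-6:])   # World!   the last six characters

# Iterating visits one character at a time
first_three = []
for char in a[:3]:
    first_three.append(char)
print(first_three)  # ['H', 'e', 'l']
```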

Lists (and tuples)¶

Lists¶

  • Created with square brackets
In [26]:
a = [1,2,3]
type(a) 
Out[26]:
list
In [27]:
a.append(4) # the append method adds an element to the end of the list
  • Definition: Mutable sequence of objects
    • mutable: Can change it after creation
    • sequence: An ordered collection
    • of objects: Items can consist of anything
In [28]:
a[0] = "here"
a
Out[28]:
['here', 2, 3, 4]
  • len works for all collections
In [29]:
len(a)
Out[29]:
4
  • Lists are used a lot!
  • Highly optimized for fast appending!
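A typical pattern is to start with an empty list and append to it. The sketch below uses a made-up list called `measurements`:

```python
measurements = []            # start empty
measurements.append(1.5)
measurements.append(2.5)
measurements.append("missing")   # items can have different types

print(measurements)       # [1.5, 2.5, 'missing']
print(len(measurements))  # 3

# Mutable: replace an item in place
measurements[2] = 3.5
print(measurements)       # [1.5, 2.5, 3.5]
```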

Tuples¶

  • Created with round brackets
  • Definition: Immutable sequence of objects
    • immutable: Cannot change after creation
  • Single element tuples need a trailing comma
  • Sometimes the brackets can be omitted
  • Less flexible than lists, and less common
  • That is somewhat unfair to tuples:
    • immutable: often helps to prevent bugs
    • hashable (if all their elements are hashable): can be used in more places, for example as dictionary keys
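The points above can be sketched as follows. All variable names (`point`, `single`, `packed`, `lookup`) are made up for illustration:

```python
point = (1, 2)       # round brackets create a tuple
single = (1,)        # a single element tuple needs the comma
not_a_tuple = (1)    # without the comma this is just the int 1
packed = 1, 2, 3     # the brackets can be left out here

print(type(single))       # <class 'tuple'>
print(type(not_a_tuple))  # <class 'int'>
print(packed)             # (1, 2, 3)

# Immutable: assigning to an element raises a TypeError
try:
    point[0] = 10
except TypeError as err:
    print(err)  # 'tuple' object does not support item assignment

# Hashable: a tuple can be a dictionary key, a list cannot
lookup = {point: "start"}
print(lookup[(1, 2)])  # start
```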

Selecting elements¶

  • Selecting elements is the same for lists, tuples, and other sequences
  • Indexing starts at 0
In [30]:
a = [1, 2, 3, 4, 5]
a[0]
Out[30]:
1
  • Upper index of slices is not included
In [31]:
a[1:2]
Out[31]:
[2]
  • lower and upper index can be left implicit
In [32]:
a[:2]
Out[32]:
[1, 2]
In [33]:
a[2:]
Out[33]:
[3, 4, 5]
  • negative indices start from the end
In [34]:
a[-1]
Out[34]:
5
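Negative indices also work inside slices, and the same syntax applies to tuples and strings. A short sketch (the names `t` and `s` are illustrative):

```python
a = [1, 2, 3, 4, 5]

print(a[1:4])   # [2, 3, 4]     index 4 is not included
print(a[-2:])   # [4, 5]        the last two elements
print(a[:-1])   # [1, 2, 3, 4]  everything except the last element

# Same rules for other sequence types
t = (1, 2, 3, 4, 5)
s = "Hello"
print(t[1:4])   # (2, 3, 4)
print(s[1:4])   # ell
```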

Dictionaries¶

  • Map a set of keys to a set of values
    • Dictionary is a mapping type
  • Creation by curly braces and : to separate keys and values
  • mutable: Can add or overwrite entries
  • Insertion order is preserved (guaranteed since Python 3.7; already true in CPython 3.6)
In [35]:
# Create a dictionary
a = {'a': 1, 'b': 2, 'c': 3}
type(a)
Out[35]:
dict
In [36]:
# Access a value using square brackets
a['b']
Out[36]:
2
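Adding and overwriting entries uses the same square bracket syntax as access. A minimal sketch that continues with the dictionary from above:

```python
a = {'a': 1, 'b': 2, 'c': 3}

a['d'] = 4    # new key: the entry is added at the end
a['a'] = 10   # existing key: the value is overwritten, the position stays

print(a)         # {'a': 10, 'b': 2, 'c': 3, 'd': 4}
print(list(a))   # ['a', 'b', 'c', 'd']   keys in insertion order
print('b' in a)  # True    `in` checks the keys
print(len(a))    # 4
```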

What can go in a dict?¶

  • Keys need to be hashable, for example strings, ints, or tuples
  • Values can be absolutely anything
  • If values are dicts we get nested dictionaries
In [37]:
nested = {
    'a': {
        'b': {
            'c': 3,
            'd': 'hello'
            }
        }
    }

# Chain access a nested value
nested['a']['b']['c']
Out[37]:
3
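A sketch of tuple keys and mixed value types. The dictionaries `grid` and `mixed` are made up for this example:

```python
# Tuples are hashable, so they can serve as keys
grid = {(0, 0): "origin", (1, 0): "right"}
print(grid[(1, 0)])  # right

# Values can be of any type, including lists and other dicts
mixed = {"count": 3, "values": [1, 2, 3], "settings": {"verbose": True}}
print(mixed["values"][0])            # 1
print(mixed["settings"]["verbose"])  # True

# Nested entries can be overwritten with chained access
mixed["settings"]["verbose"] = False
print(mixed["settings"])             # {'verbose': False}
```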

When to use dictionaries?¶

  • Dictionaries provide label based access
  • Lists provide position based access
  • Label based access is more readable and less error prone!
  • Example use-cases:
    • Storing model specifications
    • Storing results of your analysis
    • ...
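To see why label based access is more readable, compare the same hypothetical model specification stored as a list and as a dictionary. The keys and values are invented for illustration:

```python
# Position based: you have to remember what each slot means
spec_list = ["ols", 0.05, True]
print(spec_list[1])        # 0.05, but what is it?

# Label based: the key documents the meaning
spec_dict = {"model": "ols", "alpha": 0.05, "robust": True}
print(spec_dict["alpha"])  # 0.05

# Adding a setting does not shift the meaning of existing entries
spec_dict["n_obs"] = 100
print(spec_dict["model"])  # ols
```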

Tracebacks¶

  • Traceback: Detailed report that helps you to localize the error
    • Pro tip: Read the traceback!
In [38]:
d = {'a':1}
d[[1, 2, 3]] = "b"
d["c"]=2
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[38], line 2
      1 d = {'a':1}
----> 2 d[[1, 2, 3]] = "b"
      3 d["c"]=2

TypeError: unhashable type: 'list'

The code above has a problem¶

  • Traceback tells us everything we need:

    1. What type of Exception occurred: TypeError
    2. Where did it occur: In line 2 of the cell (Cell In[38])
    3. What happened exactly: a list, which is unhashable, was used as a dictionary key
  • Tracebacks can get very long! Read from bottom to top.

  • Always look for these three things!
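Once the traceback has pointed to the problem, the fix is usually small. One possible repair, assuming a tuple is an acceptable key here, is sketched below; the `try` / `except` part only shows how the exception type can be inspected in code:

```python
d = {'a': 1}

# The original line fails because a list is not hashable
try:
    d[[1, 2, 3]] = "b"
except TypeError as err:
    print(type(err).__name__)  # TypeError

# A tuple is hashable, so it works as a key
d[(1, 2, 3)] = "b"
d["c"] = 2
print(d)  # {'a': 1, (1, 2, 3): 'b', 'c': 2}
```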

Common sources of errors¶

  • ValueError : Called a function with something invalid

  • KeyError : Typo in a dictionary key

  • NameError : Typo in a variable name

  • TypeError : Called a function with something that has the wrong type

  • ImportError : Typo in an import
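The sketch below triggers each of these errors on purpose and records the exception name. It also includes NameError, which is what a misspelled variable name raises. The names `caught`, `undefined_variabel` and `module_that_does_not_exist` are made up.

```python
caught = []

try:
    int("abc")                  # valid type (str), invalid value
except ValueError as err:
    caught.append(type(err).__name__)

try:
    {"a": 1}["b"]               # key does not exist
except KeyError as err:
    caught.append(type(err).__name__)

try:
    len(5)                      # an int has no length
except TypeError as err:
    caught.append(type(err).__name__)

try:
    undefined_variabel          # misspelled, never defined
except NameError as err:
    caught.append(type(err).__name__)

try:
    import module_that_does_not_exist  # hypothetical module name
except ImportError as err:
    # ModuleNotFoundError is a subclass of ImportError
    caught.append(type(err).__name__)

print(caught)
# ['ValueError', 'KeyError', 'TypeError', 'NameError', 'ModuleNotFoundError']
```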

How to ask for help?¶

  • I do not remember what task 3 in exercise 5 is
  • I like to see that you tried on your own
  • I like to see that you tried to reduce the amount I have to read
  • I love well formatted, self-contained examples
    • Always include the traceback
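As an illustration of what a self-contained example could look like: a few lines with tiny made-up data that reproduce the error on their own, plus the full traceback. Here the traceback is captured with the standard library function `traceback.format_exc` so it can be pasted into a message. The dictionary `prices` is invented.

```python
import traceback

# Tiny made-up data that is enough to reproduce the problem
prices = {"apple": 1.2, "pear": 0.8}

try:
    total = prices["apple"] + prices["banana"]
except KeyError:
    # Full traceback as a string, ready to copy into your question
    report = traceback.format_exc()
    print(report)
```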

Further references¶

  • See an excellent blogpost by Matthew Rocklin that shows how to ideally ask for help.
  • See a realpython post on how to read tracebacks

Executing .py files in VS Code¶

Open a file¶


Solution 1: Run your script in the terminal¶

Open the terminal in VS Code and run your script with python <script.py>
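For reference, a minimal sketch of what such a file might contain. The file name `script.py` and the function `main` are assumptions for this example:

```python
# Contents of a hypothetical file named script.py

def main():
    greeting = "Hello from a script"
    print(greeting)
    return greeting


# This block runs only when the file is executed directly,
# for example with `python script.py`, not when it is imported
if __name__ == "__main__":
    main()
```

Running `python script.py` in the terminal would then print the greeting.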


Solution 2: Select the python interpreter¶

  • Open the command palette (Ctrl + Shift + P)
    • Run "Python: Select Interpreter" and choose the environment you want to use

You are now working in your virtual environment¶

You can also run using VS Code's Run button¶

  • Nice debugger functionality