From: Maximilian Friedersdorff Date: Thu, 30 May 2019 12:43:56 +0000 (+0100) Subject: Add content for slides X-Git-Url: https://git.friedersdorff.com/?a=commitdiff_plain;h=7d3423f860c17e76554c34fbfbb21f7ce4c6d595;p=max%2Fintro_dice_and_pmgmnt.git Add content for slides --- diff --git a/password_reuse_1.png b/password_reuse_1.png new file mode 100644 index 0000000..acd0b7a Binary files /dev/null and b/password_reuse_1.png differ diff --git a/password_reuse_2.png b/password_reuse_2.png new file mode 100644 index 0000000..4dea318 Binary files /dev/null and b/password_reuse_2.png differ diff --git a/password_reuse_3.png b/password_reuse_3.png new file mode 100644 index 0000000..e14b5dc Binary files /dev/null and b/password_reuse_3.png differ diff --git a/slides.rst b/slides.rst index 057a5d3..5e7d6ce 100644 --- a/slides.rst +++ b/slides.rst @@ -1,158 +1,85 @@ -Plotting with Matplotlib ------------------------- - -Also creating a presentation with rst2pdf -========================================= - -Data Structures ---------------- -Favour simpler data structures if they do what you need. In order: - -#. Built-in Lists - - 2xN data or simpler - - Can't install system dependencies -#. Numpy arrays - - 2 (or higher) dimensional data - - Lots of numerical calculations -#. Pandas series/dataframes - - 'Data Wrangling', reshaping, merging, sorting, querying - - Importing from complex formats - -Shamelessly stolen from https://stackoverflow.com/a/45288000 - -Loading Data from Disk ----------------------- -Natively -======== - -.. code-block:: python - - >>> import csv - >>> with open('eggs.csv', newline='') as csvfile: - ... spam = csv.reader(csvfile, - ... delimiter=' ', - ... quotechar='|') - ... for row in spam: - ... # Do things - ... pass - -Loading Data from Disk ----------------------- -Numpy -===== - -.. code-block:: python - - >>> import numpy - >>> spam = numpy.genfromtxt('eggs.csv', - ... delimiter=' ', - ... dtype=None) # No error handling! 
- >>> for row in spam: - ... # Do things - ... pass - -``numpy.genfromtxt`` will try to infer the datatype of each column if -``dtype=None`` is set. - -``numpy.loadtxt`` is generally faster at runtime if your data is well formated -(no missing values, only numerical data or constant length strings) - -Loading Data from Disk ----------------------- -Numpy NB. -========= -**Remind me to look at some actual numpy usage at the end** - -- I think numpy does some type coercion when creating arrays. -- Arrays created by ``numpy.genfromtxt`` can not in general be indexed like - ``data[xstart:xend, ystart:yend]``. -- Data of unequal types are problematic! Pandas *may* be a better choice in - that case. -- Specifying some value for ``dtype`` is probably necessary in most cases in - practice: https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html - -Loading Data from Disk ----------------------- -Pandas -====== - -.. code-block:: python - - >>> import pandas - >>> # dtype=None is def - >>> spam = pandas.read_csv('eggs.csv', - ... delimiter=' ', - ... header=None) - >>> for row in spam: - ... # Do things - ... pass - -``header=None`` is required if the flie does not have a header. - - - -Generating Data for Testing ---------------------------- - -Generating the data on the fly with numpy is convenient. - -.. code-block:: python - - >>> import numpy.random as ran - >>> # For repeatability - >>> ran.seed(7890234) - >>> # Uniform [0, 1) floats - >>> data = ran.rand(100, 2) - >>> # Uniform [0, 1) floats - >>> data = ran.rand(100, 100, 100) - >>> # Std. normal floats - >>> data = ran.randn(100) - >>> # 3x14x15 array of binomial ints with n = 100, p = 0.1 - >>> data = ran.binomial(100, 0.1, (3, 14, 15)) - -Plotting Time Series --------------------- - -Plot data of the form: - -.. math:: y=f(t) - - -Subplots --------- - - -Saving Plots ------------- - -So far I've just displayed plots with ``plt.show()``. 
You can actually save -the plots from that interface manually, but when scripting, it's convenient -to do so automatically: - -.. code-block:: python - - >>> # Some plotting has previously occured - >>> plt.savefig('eggs.pdf', dpi=300, transparent=False) - -The output format is interpreted from the file extension. -The keyword arguments are optional here. Other options exist. - -Error Bars ----------- - - -Stacked Bar Graph ------------------ - - -Resources ---------- -NumPy User Guide: https://docs.scipy.org/doc/numpy/user/index.html - -NumPy Reference: https://docs.scipy.org/doc/numpy/reference/index.html#reference - -Matplotlib example gallery: https://matplotlib.org/gallery/index.html - -Pandas: It probably exists. Good luck. - -This presentation: https://git.friedersdorff.com/max/plotting_with_matplotlib.git +Why is Password Reuse a Problem? +-------------------------------- +.. image:: password_reuse_1.png +.. image:: password_reuse_2.png +.. image:: password_reuse_3.png + +About password strength +----------------------- +How is strength measured? +========================= +'Entropy' `s` depends on the size of the alphabet `a` and the length `n` of the +password: + +.. math:: + s = \log_2(a^n) + +* 0889234877724602 -> 53 bits +* ZeZJieatdH -> 60 bits + +Why are weak passwords problematic? +=================================== +Weak passwords are trivial to crack in many situations. A password with 53 bits +of entropy may be cracked by a criminal organisation in less than an hour. + + +What about strong passwords? +============================ +They are difficult to remember, especially when you use a different +strong password for every service. You are then tempted to write them down or +reuse them. + +It's surprisingly difficult for humans to generate good passwords! + +Password Managers to the Rescue! +-------------------------------- +Password managers allow you to create a unique and strong password for every +service. 
+ +Additional benefits: + +* Remembers passwords for you +* Generates passwords for you +* Automagically fills in passwords on websites for you (this is important!) +* Makes passwords available on all your configured devices +* Can store additional related data: usernames, answers to security questions, + PINs for debit/credit cards + +All mainstream password managers are equivalent in the above respects. + +Can you trust password managers? +-------------------------------- +Yes* + +How do they keep passwords secure? +---------------------------------- +1. User supplies a password +2. The password is used to derive an encryption key. This process is designed + to be slow, even on modern hardware +3. The encryption key generated this way is used to encrypt/decrypt your passwords + +Note that the security of the encryption depends on the strength of your +password. With a poor password (50 bits), it would take the entire computing +power of the world less than a month to crack the database. With a decent-ish +password (60 bits), it would take on the order of 50 years on average. With a +better password (70 bits), it would take on the order of 50,000 years. + +Generating a Strong Password +---------------------------- +Passphrases are better than passwords: + +* Tr0ub4dor&3 -> 28 bits of entropy, hard to remember +* correct horse battery staple -> 44 bits of entropy, easy to remember + +Use passphrases wherever you have to remember the secret yourself. + +Generate passphrases with Diceware +================================== +1. Roll five 6-sided, *physical* dice +2. Read the numbers left to right +3. Find the word with that number on a list of 6^5 (7776) words +4. Repeat until desired length is reached. For a password manager, use at + least 7 words. +5. Write down your passphrase on paper and keep it somewhere secure +6. If you are 100% confident that you will not forget the passphrase, destroy + the paper by burning it.
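
The entropy formula and the Diceware steps in the slides above can be sketched
in Python. This is a minimal illustration and not part of the commit: the
function names ``entropy_bits``, ``roll_dice_key`` and ``key_to_index`` are
hypothetical, the sketch assumes every symbol or word is chosen uniformly at
random, and it substitutes the operating system's random number generator (via
``secrets``) for the physical dice that the slides recommend.

.. code-block:: python

    import math
    import secrets


    def entropy_bits(alphabet_size: int, length: int) -> float:
        """Entropy s = log2(a^n) = n * log2(a) for uniformly random symbols."""
        return length * math.log2(alphabet_size)


    def roll_dice_key(n_dice: int = 5) -> str:
        """Simulate n_dice six-sided dice, read left to right (e.g. '41326')."""
        return "".join(str(secrets.randbelow(6) + 1) for _ in range(n_dice))


    def key_to_index(key: str) -> int:
        """Map a dice key to a 0-based index into a list of 6**len(key) words."""
        index = 0
        for digit in key:
            # Each die is one base-6 digit, with faces 1..6 mapped to 0..5.
            index = index * 6 + (int(digit) - 1)
        return index


    print(round(entropy_bits(10, 16)))     # 16 random decimal digits
    print(round(entropy_bits(6 ** 5, 7)))  # 7 Diceware words

Under these assumptions, 16 random decimal digits come out at about 53 bits,
matching the first example in the slides, and a 7-word Diceware passphrase at
about 90 bits.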