X-Git-Url: https://git.friedersdorff.com/?a=blobdiff_plain;f=slides.rst;h=b4bba61bb071ecd294f8e2910861e94acc54b640;hb=4b392c68a9af343e9e5312e709c2cde6b6f6828f;hp=057a5d30bf4065a69d78f4a899e6fb88814ec85d;hpb=ce54c823b13f59665a381e086e7970ba22502fae;p=max%2Fintro_dice_and_pmgmnt.git

diff --git a/slides.rst b/slides.rst
index 057a5d3..b4bba61 100644
--- a/slides.rst
+++ b/slides.rst
@@ -1,158 +1,175 @@
-Plotting with Matplotlib
-------------------------
-
-Also creating a presentation with rst2pdf
-=========================================
-
-Data Structures
----------------
-Favour simpler data structures if they do what you need. In order:
-
-#. Built-in Lists
-   - 2xN data or simpler
-   - Can't install system dependencies
-#. Numpy arrays
-   - 2 (or higher) dimensional data
-   - Lots of numerical calculations
-#. Pandas series/dataframes
-   - 'Data Wrangling', reshaping, merging, sorting, querying
-   - Importing from complex formats
-
-Shamelessly stolen from https://stackoverflow.com/a/45288000
-
-Loading Data from Disk
-----------------------
-Natively
-========
-
-.. code-block:: python
-
-    >>> import csv
-    >>> with open('eggs.csv', newline='') as csvfile:
-    ...     spam = csv.reader(csvfile,
-    ...                       delimiter=' ',
-    ...                       quotechar='|')
-    ...     for row in spam:
-    ...         # Do things
-    ...         pass
-
-Loading Data from Disk
-----------------------
-Numpy
-=====
-
-.. code-block:: python
-
-    >>> import numpy
-    >>> spam = numpy.genfromtxt('eggs.csv',
-    ...                         delimiter=' ',
-    ...                         dtype=None) # No error handling!
-    >>> for row in spam:
-    ...     # Do things
-    ...     pass
-
-``numpy.genfromtxt`` will try to infer the datatype of each column if
-``dtype=None`` is set.
-
-``numpy.loadtxt`` is generally faster at runtime if your data is well formated
-(no missing values, only numerical data or constant length strings)
-
-Loading Data from Disk
-----------------------
-Numpy NB.
-=========
-**Remind me to look at some actual numpy usage at the end**
-
-- I think numpy does some type coercion when creating arrays.
-- Arrays created by ``numpy.genfromtxt`` can not in general be indexed like
-  ``data[xstart:xend, ystart:yend]``.
-- Data of unequal types are problematic! Pandas *may* be a better choice in
-  that case.
-- Specifying some value for ``dtype`` is probably necessary in most cases in
-  practice: https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
-
-Loading Data from Disk
-----------------------
-Pandas
-======
-
-.. code-block:: python
-
-    >>> import pandas
-    >>> # dtype=None is def
-    >>> spam = pandas.read_csv('eggs.csv',
-    ...                        delimiter=' ',
-    ...                        header=None)
-    >>> for row in spam:
-    ...     # Do things
-    ...     pass
-
-``header=None`` is required if the flie does not have a header.
-
-
-
-Generating Data for Testing
----------------------------
-
-Generating the data on the fly with numpy is convenient.
-
-.. code-block:: python
-
-    >>> import numpy.random as ran
-    >>> # For repeatability
-    >>> ran.seed(7890234)
-    >>> # Uniform [0, 1) floats
-    >>> data = ran.rand(100, 2)
-    >>> # Uniform [0, 1) floats
-    >>> data = ran.rand(100, 100, 100)
-    >>> # Std. normal floats
-    >>> data = ran.randn(100)
-    >>> # 3x14x15 array of binomial ints with n = 100, p = 0.1
-    >>> data = ran.binomial(100, 0.1, (3, 14, 15))
-
-Plotting Time Series
---------------------
-
-Plot data of the form:
-
-.. math:: y=f(t)
-
-
-Subplots
---------
-
-
-Saving Plots
+Surviving phishing
+------------------
+Password reuse, password managers and strong passwords
+======================================================
+.. contents:: :depth: 1
+
+Why is Password Reuse a Problem?
+--------------------------------
+.. image:: password_reuse_1.png
+   :height: 6.5cm
+
+Consider the following hypothetical users who reuse a strong password in
+most places, and the following common scenario:
+
++------------------+--------------------------+
+| User             | Password                 |
++==================+==========================+
+| mark1@gmail.com  | QUo5Qt+1Wa/Q1smDJRDbFg== |
++------------------+--------------------------+
+| mark2@gmail.com  | +9Hz+/20rVkSkbcsmgdVFw== |
++------------------+--------------------------+
+| mark3@gmail.com  | wnYkRcbi7Kkh7Fx2uR8EeA== |
++------------------+--------------------------+
+
+#. User registers an account with a careless service, e.g. Facebook, Yahoo,
+   Google, Equifax, etc.
+#. The service is hacked and the password and email address are leaked
+#. The hacker logs in to the email account
+#. The hacker resets passwords on all important accounts tied to that email
+   address
+
+
+About password strength
+-----------------------
+
+How is strength measured?
+=========================
+'Entropy' `s` depends on the size of the alphabet `a` and the length `n` of the
+password:
+
+.. math::
+    s = \log_2(a^n)
+
+* 0889234877724602 -> 53 bits
+* ZeZJieatdH -> 60 bits
+
+Why are weak passwords problematic?
+===================================
+Weak passwords are trivial to crack in many situations. A password with 53
+bits of entropy may be cracked by a criminal organisation in less than an hour.
+
+
+What about strong passwords?
+============================
+They are difficult to remember, which is a particular problem when you use a
+different strong password for every service. You are also tempted to write
+them down or reuse them.
+
+It's surprisingly difficult for humans to generate good passwords!
+
+A strong password, as of 2019, has at least 80 bits of entropy.
+
+Password Managers to the Rescue!
+--------------------------------
+Password managers allow you to create a unique and strong password for every
+service.
+
+Additional benefits:
+
+* Remembers passwords for you
+* Generates passwords for you
+* Automagically fills in passwords on websites for you; this is important!
+* Makes passwords available on all your configured devices
+* Can store additional related data: usernames, answers to security questions,
+  PINs for debit/credit cards
+
+All mainstream password managers are equivalent in the above respects.
+
+Can you trust password managers?
+--------------------------------
+Yes*
+
+How do they keep passwords secure?
+----------------------------------
+1. User supplies a password
+2. A slow function derives an encryption key
+3. The encryption key is used to encrypt/decrypt your passwords
+
+Security of the encryption depends on the strength of your
+password:
+
++---------+------------------------+
+| Entropy | Time to crack,         |
+|         | assuming 1 second per  |
+|         | attempt per typical    |
+|         | CPU                    |
++=========+========================+
+| 50 bits | < 1 month              |
++---------+------------------------+
+| 60 bits | ~ 50 years             |
++---------+------------------------+
+| 70 bits | ~ 50,000 years         |
++---------+------------------------+
+
+Generating a Strong Password
+----------------------------
+Passphrases are better than passwords:
+
+* Tr0ub4dor&3 -> 28 bits of entropy, hard to remember
+* correct horse battery staple -> 44 bits of entropy, easy to remember
+
+If you have to remember it, use a passphrase.
+
+Generate passphrases with Diceware_
+===================================
+1. Roll five six-sided *physical* dice
+2. Read the numbers left to right
+3. Find the word with that number on a list of 6^5 (7776) words
+4. Repeat until the desired length is reached. For a password manager, use at
+   least 7 words.
+5. Write down your passphrase on paper and keep it somewhere secure
+6. If you are 100% confident that you will not forget the passphrase, destroy
+   the paper by burning it
+
+What about phishing?
+====================
+* A password manager will refuse to fill in a password on a spoofed website,
+  for instance faceb00k.com vs facebook.com
+* Using different passwords on every service protects all other services even
+  if phishing is successful on one of them
+* Good password managers will navigate to the login page for you, reducing the
+  risk of spoofed websites
+
+
+Other advice
 ------------
+In no particular order:
+
+* Only log in on webpages that you navigated to by typing in the URL yourself,
+  by searching on Google, DuckDuckGo or some other reputable search engine, or
+  from a bookmark. If clicking a link in an email takes you to a login page,
+  it's probably a phishing attempt
+* Only log in to webpages that are protected by SSL/TLS (HTTPS). Look for a
+  green address bar, a green lock icon or similar in your browser
+* Use two-factor or two-step authentication wherever possible
+* Turn off automatic image rendering in your email client. Better still,
+  disable HTML rendering and authoring entirely
+* Be suspicious of *all* emails. Risky things: HTML email, images, unknown
+  sender, poor spelling/grammar, 'Your email client can't display this email,
+  click here to view in your browser' or similar attempts to coerce you to click
+  on things
 
-So far I've just displayed plots with ``plt.show()``. You can actually save
-the plots from that interface manually, but when scripting, it's convenient
-to do so automatically:
-
-.. code-block:: python
-
-    >>> # Some plotting has previously occured
-    >>> plt.savefig('eggs.pdf', dpi=300, transparent=False)
-
-The output format is interpreted from the file extension.
-The keyword arguments are optional here. Other options exist.
-
-Error Bars
-----------
+Resources
+---------
+`EFF notes on Diceware`_. The EFF generally has good advice on these kinds of
+topics.
 
 
-Stacked Bar Graph
------------------
+`This Presentation`_
 
+`Keepass`_, an offline password manager
 
-Resources
----------
-NumPy User Guide: https://docs.scipy.org/doc/numpy/user/index.html
+`1Password`_, a pay to use password manager with some nice features
 
-NumPy Reference: https://docs.scipy.org/doc/numpy/reference/index.html#reference
+`LastPass`_, an online password manager with a gratis tier
 
-Matplotlib example gallery: https://matplotlib.org/gallery/index.html
+.. _Diceware: http://world.std.com/~reinhold/diceware.html
+.. _EFF notes on Diceware: https://www.eff.org/dice
+.. _This Presentation: https://git.friedersdorff.com/max/intro_dice_and_pmgmnt
+.. _Keepass: https://keepass.info/
+.. _1Password: https://1password.com/
+.. _LastPass: https://www.lastpass.com/
 
-Pandas: It probably exists. Good luck.
 
-This presentation: https://git.friedersdorff.com/max/plotting_with_matplotlib.git
+.. target-notes::
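
The entropy figures quoted in the added slides can be checked directly from the formula s = log2(a^n) = n * log2(a). The following is a minimal sketch and not part of the patch itself. The function names ``entropy_bits`` and ``min_length`` are hypothetical, and the alphabet size of 62 (a-z, A-Z, 0-9) for the ``ZeZJieatdH`` example is an assumption, since the slides do not state which alphabet was used.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy s = log2(a^n) = n * log2(a) of a uniformly random password."""
    if alphabet_size < 2 or length < 0:
        raise ValueError("alphabet_size must be >= 2 and length >= 0")
    return length * math.log2(alphabet_size)


def min_length(alphabet_size: int, target_bits: float) -> int:
    """Smallest n such that n * log2(a) >= target_bits."""
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be >= 2")
    return math.ceil(target_bits / math.log2(alphabet_size))


# 16 random decimal digits (a = 10), as in "0889234877724602"
digits = entropy_bits(10, 16)

# 10 random characters, assuming an alphabet of a-z, A-Z, 0-9 (a = 62),
# as in "ZeZJieatdH"
alnum = entropy_bits(62, 10)

# 7 Diceware words, each drawn from a list of 6**5 = 7776 words
diceware7 = entropy_bits(6**5, 7)

print(f"16 digits:        {digits:.1f} bits")
print(f"10 alphanumerics: {alnum:.1f} bits")
print(f"7 Diceware words: {diceware7:.1f} bits")

# Lengths needed to reach the 80-bit threshold named in the slides
print("alphanumeric characters for 80 bits:", min_length(62, 80))
print("Diceware words for 80 bits:", min_length(6**5, 80))
```

Under these assumptions the two slide examples come out at roughly 53 and 60 bits, and seven Diceware words is the shortest passphrase that clears 80 bits, which agrees with the advice to use at least 7 words for a password manager. The calculation only holds when every symbol or word is chosen uniformly at random; human-chosen passwords have less entropy than the formula suggests.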