new random concept #3556

Merged
merged 30 commits into from
Dec 7, 2023
4ee91c6
new `random` concept
colinleach Nov 29, 2023
2947331
changed unused variables to underscore
colinleach Nov 30, 2023
28cc392
Update concepts/random/about.md
colinleach Dec 6, 2023
d0ac75d
Update concepts/random/about.md
colinleach Dec 6, 2023
4093f7a
Update concepts/random/about.md
colinleach Dec 6, 2023
33a5f90
Update concepts/random/about.md
colinleach Dec 6, 2023
96d1a60
Update concepts/random/about.md
colinleach Dec 6, 2023
999b443
Update concepts/random/about.md
colinleach Dec 6, 2023
18c185b
Update concepts/random/about.md
colinleach Dec 6, 2023
b318855
Update concepts/random/about.md
colinleach Dec 6, 2023
e1b6bba
Update concepts/random/about.md
colinleach Dec 6, 2023
f409376
Update concepts/random/about.md
colinleach Dec 6, 2023
4a87cdc
Update concepts/random/about.md
colinleach Dec 6, 2023
4756c7f
Update concepts/random/about.md
colinleach Dec 6, 2023
07bf0a8
Update concepts/random/about.md
colinleach Dec 6, 2023
d188e10
Update concepts/random/about.md
colinleach Dec 6, 2023
6485653
Update concepts/random/about.md
colinleach Dec 6, 2023
3b7c53c
Update concepts/random/about.md
colinleach Dec 6, 2023
2e38e78
Update concepts/random/about.md
colinleach Dec 6, 2023
d355185
Update concepts/random/about.md
colinleach Dec 6, 2023
2aaa718
Update concepts/random/about.md
colinleach Dec 6, 2023
5cf1dff
Update concepts/random/about.md
colinleach Dec 6, 2023
b801b10
Update concepts/random/about.md
colinleach Dec 6, 2023
df91e31
Update concepts/random/about.md
colinleach Dec 6, 2023
7ccdf36
Update concepts/random/about.md
colinleach Dec 6, 2023
72e2d7f
Update concepts/random/about.md
colinleach Dec 6, 2023
d8ffe3b
Merge branch 'main' into random
colinleach Dec 6, 2023
fed3a66
Added Introduction.md and Links
BethanyG Dec 6, 2023
666435b
Small touchups and link fixes
BethanyG Dec 7, 2023
15a9211
More Typo Fixes
BethanyG Dec 7, 2023
5 changes: 5 additions & 0 deletions concepts/random/.meta/config.json
@@ -0,0 +1,5 @@
{
"blurb": "The random module contains functionality to generate random values for modelling, simulations and games. It should not be used for security or cryptographic applications.",
"authors": ["bethanyg", "colinleach"],
"contributors": []
}
171 changes: 171 additions & 0 deletions concepts/random/about.md
@@ -0,0 +1,171 @@
# About

Many programs need (apparently) random values to simulate real-world events.

Common, familiar examples include:
- A coin toss: a random value from `('H', 'T')`.
- The roll of a die: a random integer from 1 to 6.
- Shuffling a deck of cards: a random ordering of a card list.

Generating truly random values with a computer is a surprisingly difficult technical challenge, so you may see these results referred to as "pseudorandom".
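
One practical consequence of pseudorandomness is that the sequence of values is fully determined by the generator's starting state. The sketch below assumes the standard-library `random.seed()` function and an arbitrary seed value of 42; the helper name `three_rolls` is made up for illustration. It shows that re-seeding replays the same sequence.

```python
import random


def three_rolls(seed):
    """Return three die rolls produced after seeding the generator."""
    random.seed(seed)  # fix the starting state of the pseudorandom generator
    return [random.randint(1, 6) for _ in range(3)]


first = three_rolls(42)   # 42 is an arbitrary seed chosen for this sketch
second = three_rolls(42)  # same seed, so the same "random" sequence

print(first == second)  # True: the values are reproducible, not truly random
```

Seeding like this can be handy for repeatable tests and simulations, and it is one reason these values are unsuitable for secrets.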

In practice, a well-designed library like the [`random`][random] module in the Python standard library is fast, flexible, and gives results that are amply good enough for most applications in modelling, simulation and games.

The rest of this page will list a few of the most common functions in `random`.

We encourage you to explore the full [`random`][random] documentation, as there are many more options.

## Important Warning!

The `random` module should __NOT__ be used for security and cryptographic applications.

Instead, Python provides the [`secrets`][secrets] module.
This module is designed to generate cryptographically strong random values, such as those needed for passwords and security tokens.
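
As a rough illustration of what reaching for `secrets` might look like, the sketch below uses three functions from that module: `token_hex()`, `choice()` and `randbelow()`. The variable names, the alphabet and the lengths are arbitrary choices for this example, not recommendations.

```python
import secrets

# A hex string built from 16 random bytes (two hex characters per byte).
token = secrets.token_hex(16)

# secrets.choice() mirrors random.choice(), but uses a secure randomness source.
alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789'
password = ''.join(secrets.choice(alphabet) for _ in range(12))

# secrets.randbelow(n) gives an integer such that 0 <= value < n.
pin_digit = secrets.randbelow(10)

print(len(token), len(password), pin_digit)
```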

## Create random integers

The `randrange()` function has three forms, to select a random value from `range(start, stop, step)`:
- `randrange(stop)` gives an integer `n` such that `0 <= n < stop`
- `randrange(start, stop)` gives an integer `n` such that `start <= n < stop`
- `randrange(start, stop, step)` gives an integer `n` such that `start <= n < stop` and `n` is in the sequence `start, start + step, start + 2*step...`

For the common case where `step == 1`, the `randint(a, b)` function may be more convenient and readable.

Possible results from `randint()` _include_ the upper bound, so `randint(a, b)` is the same as `randrange(a, b+1)`.
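
The difference in upper bounds can be checked empirically. The sketch below is only an illustration: `observed_bounds` is a hypothetical helper written for this page, and the trial count is arbitrary.

```python
import random


def observed_bounds(draw, trials=10_000):
    """Call draw() repeatedly and return the smallest and largest results."""
    results = [draw() for _ in range(trials)]
    return min(results), max(results)


# randrange() excludes the stop value; randint() includes the upper bound.
print(observed_bounds(lambda: random.randrange(1, 6)))  # never larger than 5
print(observed_bounds(lambda: random.randint(1, 6)))    # can reach 6
```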

```python
>>> import random

>>> random.randrange(500)
219
>>> [random.randrange(0, 10, 2) for _ in range(10)]
[2, 8, 4, 0, 4, 2, 6, 6, 8, 8]

>>> random.randint(1, 6) # roll a die
4
```
Member

sigh. I want to keep this here, buuuut. If this is going to be "high" on the tree, then we can't use a list comprehension here, and instead have to do loop-append. But that being said, this will also be before loops. So maybe we keep it....I will have to think about it.

Contributor Author

I admit I wondered about that after submitting it, without reaching a clear conclusion.

Another thing I worry about: these are high up but absolutely depend on import. How do we resolve that?

Contributor Author

A few minutes later, with a bit more thought: this would be a killer error in introduction.md. For about.md I think we have more leeway to forward-reference concepts the students haven't reached yet. May need an explanatory comment.

Member

> Another thing I worry about: these are high up but absolutely depend on import. How do we resolve that?

Errr. Yeah. That said, import is really easy to show by example. At least that's what I tell myself every time it comes up. But if it is also raising concerns for you, we might want to brainstorm on how we address that.

One option, of course, is to move concepts around. I like the idea of keeping math related stuff in a cluster, but I also don't want to send anyone screaming for the exit. Now, none of these are on the "critical path" of prerequisites, but a lot of folx might very well go through them because they are there.

Option two would be to briefly show & explain how you do an import, and hope that we don't get complaints about the hand-waving. I am not sure we would, since it's really straightforward. Something along the lines of:

To use the `random` module, you must first import it like so:

```python
import random

random.choice([1, 2, 3, 4, 5, 6, 7])
```

Option three would be to write up an import concept. I am allergic to this for several reasons:

  1. Doing it early means skipping the (important) details around namespacing, names, and aliases.
  2. If we leave out namespacing, names, and aliases -- what is left to include that isn't just an example?
  3. Even if we do a really good job, how much of the detail will help students, and how much will they retain?

I think I might be in favor of option two. What are your thoughts?

Contributor Author

> I think I might be in favor of option two

Agreed, with the caveat that we may also need `from xxx import yyy`. I've used that a lot in datetime, though not in the numbers cluster.

Next step: when do we discourage `from xxx import *`? Presumably, after namespaces.

Member

Oh, I might say immediately. But certainly after namespaces.


## Working with sequences

The functions in this section assume that you are starting from some sequence.

This will typically be a `list`, or with some limitations a `tuple` (a `tuple` is immutable, so it cannot be shuffled in place). A `set` is unordered and is not a sequence, so convert it to a `list` or `tuple` before passing it to these functions.

### `choice()` and `choices()`

The `choice()` function will return one random entry from a sequence.

At its simplest, the coin-flip example:

```python
>>> [random.choice(['H', 'T']) for _ in range(5)]
['T', 'H', 'H', 'T', 'H']
```

We could do essentially the same with the `choices()` function, supplying a keyword argument with the list length:

```python
>>> random.choices(['H', 'T'], k=5)
['T', 'H', 'T', 'H', 'H']
```

We assumed a fair coin with equal probability of heads or tails.
Weights can also be specified.

For example, if a bag contains 10 red balls and 15 green balls, and we pull one out at random:

```python
>>> random.choices(['red', 'green'], [10, 15])
['red']
```
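
To see the weights at work over many draws, one option is to tally the results. The sketch below assumes `collections.Counter` for the tally and an arbitrary 1,000 draws; the exact counts will differ on every run.

```python
import random
from collections import Counter

# Draw 1,000 balls with replacement from a bag of 10 red and 15 green.
draws = random.choices(['red', 'green'], weights=[10, 15], k=1000)

# Tally the colours; green should come up roughly 60% of the time (15 / 25).
tally = Counter(draws)
print(tally['red'], tally['green'])
```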

### `sample()`

The `choices()` example above assumes what statisticians call "sampling with replacement". Each choice has no effect on the probability of future choices.

In the example with red and green balls: after each choice, we return the ball to the bag and shake well before the next choice.

In a situation where we pull out a red ball and _it stays out_, there are now fewer red balls in the bag and the next choice is less likely to be red.

To simulate this "sampling without replacement", we have the `sample()` function.

The syntax of `sample()` is similar to `choices()`, except that the number of copies of each item is passed with the keyword-only `counts` parameter (available from Python 3.9) and `k` must always be given:

```python
>>> random.sample(['red', 'green'], counts=[10, 15], k=10)
['green', 'green', 'green', 'green', 'green', 'red', 'red', 'red', 'red', 'green']
```

Samples are listed in the order they were chosen.
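
One way to convince yourself that `sample()` really works without replacement is to draw every ball in the bag. The sketch below spells out the population as an explicit list (the `random` documentation describes this as equivalent to using `counts`) and uses `collections.Counter` to compare contents; the variable names are illustrative.

```python
import random
from collections import Counter

# Spell out the bag explicitly: 10 red balls and 15 green balls.
bag = ['red'] * 10 + ['green'] * 15

# Drawing every ball without replacement must return exactly what went in.
all_drawn = random.sample(bag, k=len(bag))
print(Counter(all_drawn) == Counter(bag))  # True, whatever the order

# Asking for more balls than the bag holds is an error.
try:
    random.sample(bag, k=len(bag) + 1)
except ValueError as error:
    print('Cannot draw 26 balls from 25:', error)
```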

### `shuffle()`

Both `choices()` and `sample()` return new lists, leaving the original sequence unchanged.

In contrast, `shuffle()` randomizes the order of a list _in place_.

```python
>>> my_list = [1, 2, 3, 4, 5]
>>> random.shuffle(my_list)
>>> my_list
[4, 1, 5, 2, 3]
```

The original ordering is lost.
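
If the original order still matters, one possible approach is to ask `sample()` for as many items as the list contains, which gives a shuffled copy. The names below are illustrative.

```python
import random

original = [1, 2, 3, 4, 5]

# sample() with k equal to the list length returns a new, shuffled list.
shuffled_copy = random.sample(original, k=len(original))

print(original)       # still [1, 2, 3, 4, 5]
print(shuffled_copy)  # the same items in a random order
```

This also works for a `tuple`, which `shuffle()` cannot handle because it modifies its argument in place.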

## Working with distributions

Until now, we have concentrated on cases where all outcomes are equally likely.

For example, `random.randrange(100)` is equally likely to give any integer from 0 to 99.

Many real-world situations are less simple than this. Statisticians have created a wide variety of `distributions` to describe the results mathematically.

### Uniform distributions

For integers, `randrange()` and `randint()` are used when all probabilities are equal. This is called a `uniform` distribution.

There are floating-point equivalents to `randrange()` and `randint()`.

__`random()`__ gives a `float` value `x` such that `0.0 <= x < 1.0`.

__`uniform(a, b)`__ gives `x` such that `a <= x <= b`.

```python
>>> [round(random.random(), 3) for _ in range(5)]
[0.876, 0.084, 0.483, 0.22, 0.863]

>>> [round(random.uniform(2, 5), 3) for _ in range(5)]
[2.798, 2.539, 3.779, 3.363, 4.33]
```
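
The two functions are closely related: the `random` documentation describes `uniform(a, b)` in terms of the expression `a + (b - a) * random()`. The sketch below writes that out by hand; `scaled_random` is a hypothetical helper for illustration, not part of the module.

```python
import random


def scaled_random(a, b):
    """Stretch a value from random() so that it falls between a and b."""
    return a + (b - a) * random.random()


values = [round(scaled_random(2, 5), 3) for _ in range(5)]
print(values)  # five floats between 2 and 5
```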

### Gaussian distribution

Also called the "normal" or "bell-shaped" curve, this is a very common way to describe imprecision in measured values.

For example, suppose the factory where you work has just bought 10,000 bolts which should be identical.
You want to set up the factory robot to handle them, so you weigh a sample of 100 and find that they have an average (or `mean`) weight of 4.731g.

This is extremely unlikely to mean that they all weigh exactly 4.731g.
Perhaps you find that values range from 4.627 to 4.794g but cluster around 4.731g.

This is the [`Gaussian distribution`][gaussian-distribution], for which probabilities peak at the mean and tail off symmetrically on both sides (hence "bell-shaped").

To simulate this in software, we need some way to specify the width of the curve (typically, expensive bolts will cluster more tightly around the mean than cheap bolts!).

By convention, this is done with the [`standard deviation`][standard-deviation]: small values for a sharp, narrow curve, large for a low, broad curve.

Mathematicians love Greek letters, so we use `mu` for the mean and `sigma` for the standard deviation.
Thus, if you read that "95% of values are within 2-sigma of the mean" or "the Higgs boson has been detected with 5-sigma confidence", such comments relate to the standard deviation.

```python
>>> mu = 4.731
>>> sigma = 0.316
>>> [round(random.gauss(mu, sigma), 3) for _ in range(5)]
[4.72, 4.957, 4.64, 4.556, 4.968]
```
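
With a larger simulated batch, the "2-sigma" rule of thumb can be checked directly. The sketch below reuses the `mu` and `sigma` values from the example above, assumes the standard-library `statistics` module for the summary figures, and uses an arbitrary batch size of 10,000; results vary slightly from run to run.

```python
import random
import statistics

mu = 4.731     # mean bolt weight in grams, from the example above
sigma = 0.316  # standard deviation, from the example above

# Simulate weighing a large batch of bolts.
weights = [random.gauss(mu, sigma) for _ in range(10_000)]

# The sample statistics should land close to the values we asked for.
print(round(statistics.mean(weights), 3))
print(round(statistics.stdev(weights), 3))

# Fraction of bolts within two standard deviations of the mean (expect about 0.95).
within_two_sigma = sum(abs(w - mu) <= 2 * sigma for w in weights) / len(weights)
print(round(within_two_sigma, 3))
```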

[random]: https://docs.python.org/3/library/random.html
[secrets]: https://docs.python.org/3/library/secrets.html
[gaussian-distribution]: https://ned.ipac.caltech.edu/level5/Leo/Stats2_3.html
[standard-deviation]: https://www.nlm.nih.gov/oet/ed/stats/02-900.html
1 change: 1 addition & 0 deletions concepts/random/introduction.md
@@ -0,0 +1 @@
#TODO: Add introduction for this concept.
6 changes: 6 additions & 0 deletions concepts/random/links.json
@@ -0,0 +1,6 @@
[
{
"url": "https://docs.python.org/3/library/random.html/",
"description": "Official documentation for the random module."
}
]
5 changes: 5 additions & 0 deletions config.json
@@ -2567,6 +2567,11 @@
"uuid": "565f7618-4552-4eb0-b829-d6bacd03deaf",
"slug": "with-statement",
"name": "With Statement"
},
{
"uuid": "af6cad74-50c2-48f4-a6ce-cfeb72548d00",
"slug": "random",
"name": "Random"
}
],
"key_features": [