docs: rewrite reactivity guide (#3295)
Rewrite the reactivity guide to be more accessible.

Also, give readers a tip that they can copy/paste notebook files into
the editor.
akshayka authored Dec 26, 2024
1 parent da3b0a7 commit a985144
Showing 5 changed files with 145 additions and 146 deletions.
9 changes: 6 additions & 3 deletions docs/blocks.py
@@ -109,9 +109,12 @@ def on_end(self, block: etree.Element) -> None:
# result = self.md.htmlStash.store(self.md.convert(md_text))
# container.text = result

container = etree.SubElement(details, "pre")
container.set("class", "marimo-source-code")
code_block = etree.SubElement(container, "code")
copy_paste_container = etree.SubElement(details, "p")
copy_paste_container.text = "Tip: paste this code into an empty cell, and the marimo editor will create cells for you"

code_container = etree.SubElement(details, "pre")
code_container.set("class", "marimo-source-code")
code_block = etree.SubElement(code_container, "code")
code_block.set("class", "language-python")
code_block.text = code

37 changes: 0 additions & 37 deletions docs/faq.md
@@ -3,43 +3,6 @@ hide:
- navigation
---

# FAQ

- [Choosing marimo](#choosing-marimo)
- [How is marimo different from Jupyter?](#faq-jupyter)
- [What problems does marimo solve?](#faq-problems)
- [How is marimo.ui different from Jupyter widgets?](#faq-widgets)
- [Using marimo](#using-marimo)
- [Is marimo a notebook or a library?](#faq-notebook-or-library)
- [What's the difference between a marimo notebook and a marimo app?](#faq-notebook-app)
- [How does marimo know what cells to run?](#faq-reactivity)
- [Does marimo slow my code down](#faq-overhead)
- [How do I prevent automatic execution from running expensive cells?](#faq-expensive)
- [How do I disable automatic execution?](#faq-lazy)
- [How do I use sliders and other interactive elements?](#faq-interactivity)
- [How do I add a submit button to UI elements?](#faq-form)
- [How do I write markdown?](#faq-markdown)
- [How do I display plots?](#faq-plots)
- [How do I prevent matplotlib plots from being cut off?](#faq-mpl-cutoff)
- [How do I display interactive matplotlib plots?](#faq-interactive-plots)
- [How do I display objects in rows and columns?](#faq-rows-columns)
- [How do I show cell code in the app view?](#faq-show-code)
- [How do I create an output with a dynamic number of UI elements?](#faq-dynamic-ui-elements)
- [Why aren't my `on_change` handlers being called?](#faq-on-change-called)
- [Why are my `on_change` handlers in an array all referencing the last element?](#faq-on-change-last)
- [Why aren't my brackets in SQL working?](#faq-sql-brackets)
- [How do I restart a notebook?](#faq-restart)
- [How do I reload modules?](#faq-reload)
- [How does marimo treat type annotations?](#faq-annotations)
- [How do I use dotenv?](#faq-dotenv)
- [What packages can I use?](#faq-packages)
- [How do I use marimo on a remote server?](#faq-remote)
- [How do I make marimo accessible on all network interfaces?](#faq-interfaces)
- [How do I use marimo behind JupyterHub?](#faq-jupyter-hub)
- [How do I use marimo with JupyterBook?](#faq-jupyter-book)
- [How do I deploy apps?](#faq-app-deploy)
- [Is marimo free?](#faq-marimo-free)

## Choosing marimo

<a name="faq-jupyter"></a>
2 changes: 1 addition & 1 deletion docs/guides/expensive_notebooks.md
@@ -16,7 +16,7 @@ mo.stop(condition)
expensive_function_call()
```

Use [`mo.stop()`][marimo.stop] in conjunction with
Use [`mo.stop`][marimo.stop] with
[`mo.ui.run_button()`][marimo.ui.run_button] to require a button press for
expensive cells:

239 changes: 135 additions & 104 deletions docs/guides/reactivity.md
@@ -1,158 +1,189 @@
# Reactive execution
# Running cells

Every marimo notebook is a directed acyclic graph (DAG) that models how data
flows across blocks of Python code, i.e., cells.
marimo _reacts_ to your code changes: run a cell, and all other cells that
refer to the variables it defines are automatically run with the latest data.
This keeps your code and outputs consistent, and eliminates bugs before they
happen.

marimo _reacts_ to code changes, automatically executing cells with the latest
data. Execution order is determined by the DAG, not by the order of cells on
the page.
??? question "Why run cells reactively?"
marimo's "reactive" execution model makes your notebooks more reproducible
by eliminating hidden state and providing a deterministic execution order.
It also powers marimo's support for [interactive
elements](../guides/interactivity.md), for running as apps, and executing as
scripts.

Reactive execution is based on a single rule:

!!! important "Runtime Rule"
When a cell is run, marimo automatically runs all other cells that
**reference** any of the global variables it **defines**.
How marimo runs cells is one of the biggest differences between marimo and
traditional notebooks like Jupyter. Learn more at our
[FAQ](../faq.md#faq-jupyter).

!!! tip "Working with expensive notebooks"
marimo gives you tools that make it easy to work with expensive notebooks. For
example, the [runtime can be
configured](configuration/runtime_configuration.md) to be lazy, only
running cells when you ask for them to be run and marking affected cells as
stale instead of auto-running them. **See our guide on working with [expensive
notebooks](expensive_notebooks.md) for more tips.**
marimo provides tools for working with expensive notebooks, in which cells
might take a long time to run or have side-effects.

## References and definitions

A marimo notebook is a DAG where nodes are cells and edges are data
dependencies. marimo creates this graph by statically analyzing each cell
(i.e., without running it) to determine its

- references, the global variables it reads but doesn't define;
- definitions, the global variables it defines.
* The [runtime can be configured](configuration/runtime_configuration.md)
to be **lazy** instead of
automatic, marking cells as stale instead of running them.
* Use [`mo.stop`][marimo.stop] to conditionally
stop execution at runtime.

!!! tip "Global variables"
A variable can refer to any Python object. Functions, classes, and imported
names are all variables.
See [the expensive notebooks guide](expensive_notebooks.md) for more tips.

There is an edge from one cell to another if the latter cell references any
global variables defined by the former cell. The rule for reactive execution
can be restated in terms of the graph: when a cell is run, its descendants are
run automatically.
## How marimo runs cells

## Global variable names must be unique
marimo statically analyzes each cell (i.e., without running it) to determine
its

To make sure your notebook is a DAG, marimo requires that every global
variable be defined by only one cell.
- references, the global variables it reads but doesn't define;
- definitions, the global variables it defines.

!!! important "Local variables"
Variables prefixed with an underscore are local to a cell (_e.g._, `_x`). You
can use this in a pinch to fix multiple definition errors, but try instead to
refactor your code.
It then forms a directed acyclic graph (DAG) on cells, with an edge from
one cell to another if the latter references any of the definitions of the
former. When a cell is run, its descendants are marked for execution.

This rule encourages you to keep the number of global variables in your
program small, which is generally considered good practice.

## Local variables
!!! important "Runtime Rule"
When a cell is run, marimo automatically runs all other cells that
**reference** any of the global variables it **defines**.
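The following is a minimal sketch of the idea described above, not marimo's actual implementation. It uses Python's standard `ast` module to collect references and definitions for flat cells, builds edges between cells, and finds the descendants of a cell. The helper names (`analyze`, `build_children`, `descendants`) and the example cells are hypothetical, and the analysis deliberately ignores scoping inside functions and comprehensions.

```python
import ast
import builtins


def analyze(code: str) -> tuple[set[str], set[str]]:
    """Return (references, definitions) for a flat cell, without running it."""
    defs: set[str] = set()
    loads: set[str] = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Name):
            # Store context means the name is assigned; otherwise it is read.
            (defs if isinstance(node.ctx, ast.Store) else loads).add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defs.add(node.name)
        elif isinstance(node, ast.alias):
            # `import a.b` binds `a`; `import a as b` binds `b`.
            defs.add(node.asname or node.name.split(".")[0])
    # References are names that are read but not defined here (builtins excluded).
    refs = {name for name in loads - defs if not hasattr(builtins, name)}
    return refs, defs


def build_children(cells: dict[str, str]) -> dict[str, set[str]]:
    """Map each cell to the cells that reference any of its definitions."""
    analysis = {name: analyze(code) for name, code in cells.items()}
    children: dict[str, set[str]] = {name: set() for name in cells}
    for parent, (_, parent_defs) in analysis.items():
        for child, (child_refs, _) in analysis.items():
            if parent != child and parent_defs & child_refs:
                children[parent].add(child)
    return children


def descendants(cell: str, children: dict[str, set[str]]) -> set[str]:
    """Return every cell reachable from `cell` by following edges."""
    seen: set[str] = set()
    stack = [cell]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Hypothetical notebook: cell name -> source code.
cells = {
    "a": "x = 1",
    "b": "y = x + 1",
    "c": "print(x, y)",
    "d": "z = 10",
}
children = build_children(cells)
to_rerun = descendants("a", children)  # cells affected by running "a"
```

In this toy notebook, running cell `a` would mark `b` and `c` for execution, while `d` is untouched because it references nothing that `a` defines.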

Global variables prefixed with an underscore (_e.g._, `_x`) are "local" to a
cell: they can't be read by other cells. Multiple cells can reuse the same
local variable names.
marimo [does not track mutations](#variable-mutations-are-not-tracked) to
variables, nor assignments to attributes. That means that if you assign an
attribute like `foo.bar = 10`, other cells referencing `foo.bar` will _not_ be
run.

If you encapsulate your code using functions and classes when needed,
you won't need to use many local variables, if any.
### Execution order

## No hidden state
The order cells are executed in is determined by the relationships between
cells and their variables, not by the order of cells on the page (similar
to a spreadsheet). This lets you organize your code in whatever way makes the
most sense to you. For example, you can put helper functions at the bottom of
your notebook.
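As a rough illustration of ordering by dependencies rather than page position, the sketch below uses the standard library's `graphlib.TopologicalSorter`. The cell names and the dependency mapping are hypothetical, and this is not a claim about how marimo schedules cells internally.

```python
from graphlib import TopologicalSorter

# Hypothetical notebook, listed in page order: cell name -> cells it depends on.
# The helper cell sits at the bottom of the page, after the cell that uses it.
depends_on = {
    "plot": {"load", "helper"},
    "load": set(),
    "helper": set(),
}

# static_order() yields each cell only after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
```

Whatever order `load` and `helper` come out in, `plot` is always last, even though it appears first on the page.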

Traditional notebooks like Jupyter have _hidden state_: running a cell may
change the values of global variables, but these changes are not propagated to
the cells that use them. Worse, deleting a cell removes global
variables from visible code but _not_ from program memory, a common
source of bugs. The problem of hidden state has been discussed by
many others
[[1]](https://austinhenley.com/pubs/Chattopadhyay2020CHI_NotebookPainpoints.pdf)
[[2]](https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUhkKGvjtV-dkAIsUXP-AL4ffI/edit#slide=id.g362da58057_0_1).
### Deleting a cell deletes its variables

**marimo eliminates hidden state**: running
a cell automatically refreshes downstream outputs, and _deleting a cell
deletes its global variables from program memory_.
In marimo, _deleting a cell deletes its global variables from program memory_.
Cells that previously referenced these variables are automatically re-run and
invalidated (or marked as stale, depending on your [runtime
configuration](configuration/runtime_configuration.md)). In this way, marimo
eliminates a common cause of bugs in traditional notebooks like Jupyter.

<div align="center">
<!-- <div align="center">
<figure>
<img src="/_static/docs-delete-cell.gif"/>
<figcaption>No hidden state: deleting a cell deletes its variables.</figcaption>
</figure>
</div>
</div> -->

<a name="reactivity-mutations"></a>

## Avoid mutating variables
### Variable mutations are not tracked

marimo's reactive execution is based only on the global variables a cell reads
and the global variables it defines. In particular, _marimo does not track
mutations to objects_, _i.e._, mutations don't trigger reactive re-runs of
other cells. It also does not track the definition or mutation of object
attributes. For this reason, **avoid defining a variable in one cell and
marimo does not track mutations to objects, _e.g._, mutations like
`my_list.append(42)` or `my_object.value = 42` don't trigger reactive re-runs of
other cells. **Avoid defining a variable in one cell and
mutating it in another**.
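To see why an analysis based on names alone cannot notice mutations, consider this small sketch built on Python's `ast` module. The helper `stored_names` is hypothetical and only illustrates the principle; it is not marimo's code.

```python
import ast


def stored_names(code: str) -> set[str]:
    """Names that a snippet binds by plain assignment (Store context)."""
    return {
        node.id
        for node in ast.walk(ast.parse(code))
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }


assigned = stored_names("my_list = [1]")
appended = stored_names("my_list.append(42)")
attr_set = stored_names("my_object.value = 42")
```

Only the plain assignment binds a name. The method call and the attribute assignment merely read `my_list` and `my_object`, so from the point of view of definitions and references nothing new was defined.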

??? note "Why not track mutations?"

Tracking mutations reliably is impossible in Python. Reacting to mutations
could result in surprising re-runs of notebook cells.

If you need to mutate a variable (such as adding a new column to a dataframe),
you should perform the mutation in the same cell as the one that defines it,
Or try creating a new variable instead.
or try creating a new variable instead.

### Examples
??? example "Create new variables, don't mutate existing ones"

**Create a new variable instead of mutating an existing one.**
=== "Do this ..."

_Don't_ do this:
```python
l = [1]
```

```python
l = [1]
```
```python
extended_list = l + [2]
```

```python
l.append(2)
```
=== "... not this"

_Instead_, do this:
```python
l = [1]
```

```python
l = [1]
```
```python
l.append(2)
```

```python
extended_list = l + [2]
```
??? example "Mutate variables in the cells that define them"

**Mutate variables in the cells that define them.**
=== "Do this ..."

_Don't_ do this:
```python
df = pd.DataFrame({"my_column": [1, 2]})
df["another_column"] = [3, 4]
```

```python
df = pd.DataFrame({"my_column": [1, 2]})
```

```python
df["another_column"] = [3, 4]
```
=== "... not this"

```python
df = pd.DataFrame({"my_column": [1, 2]})
```

_Instead_, do this:
```python
df["another_column"] = [3, 4]
```


## Global variable names must be unique

**marimo requires that every global variable be defined by only one cell.**
This lets marimo keep code and outputs consistent.

!!! tip "Global variables"
A variable can refer to any Python object. Functions, classes, and imported
names are all variables.


This rule encourages you to keep the number of global variables in your
program small, which is generally considered good practice.
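A check for the uniqueness rule could look roughly like the sketch below. It assumes the definitions of each cell are already known; the cell names and the `cell_defs` mapping are hypothetical, and marimo's own error reporting may work differently.

```python
from collections import defaultdict

# Hypothetical notebook: cell name -> global variables it defines.
cell_defs = {
    "cell_1": {"df", "n"},
    "cell_2": {"fig"},
    "cell_3": {"df"},
}

# Record which cells define each name.
definers: dict[str, list[str]] = defaultdict(list)
for cell, names in cell_defs.items():
    for name in sorted(names):
        definers[name].append(cell)

# A name defined by more than one cell violates the rule.
conflicts = {name: found for name, found in definers.items() if len(found) > 1}
```

Here `df` is reported as a conflict because both `cell_1` and `cell_3` define it.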

### Creating temporary variables

marimo provides two ways to define temporary variables, which can
help keep the number of global variables in your notebook small.

#### Creating local variables

Variables prefixed with an underscore (_e.g._, `_x`) are "local" to a
cell: they can't be read by other cells. Multiple cells can reuse the same
local variable names.
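For instance, two cells might both use a scratch variable named `_total`. The sketch below shows the two hypothetical cells in a single block, separated by comments. Run as a plain script, the second assignment simply overwrites the first; in a marimo notebook each cell would keep its own `_total`, and only `result_a` and `result_b` would be visible to other cells.

```python
# --- cell 1 ---
_total = sum([1, 2, 3])  # local to this cell
result_a = _total * 2    # global: other cells can read it

# --- cell 2 ---
_total = sum([10, 20])   # same local name, no multiple-definition error
result_b = _total / 2    # global: other cells can read it
```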

#### Encapsulating code in functions

If you want most or all the variables in a cell to be temporary, prefixing each
variable with an underscore to make it local may feel inconvenient. In these
situations we recommend encapsulating the temporary variables in a function.

For example, if you find yourself copy-pasting the same plotting code across
multiple cells and only tweaking a few parameters, try the following pattern:

```python
df = pd.DataFrame({"my_column": [1, 2]})
df["another_column"] = [3, 4]
def _():
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.plot([1, 2])
    return ax

_()
```

!!! note "Why not track mutations?"
Tracking mutations reliably is a fundamentally impossible task in Python; marimo
could never detect all mutations, and even if we could, reacting to mutations could
result in surprising re-runs of notebook cells. The simplicity of marimo's
static analysis approach, based only on variable definitions and references,
makes marimo easy to understand and encourages well-organized notebook code.
Here, the variables `plt`, `fig`, and `ax` aren't added to the globals.


## Runtime configuration
## Configuring how marimo runs cells

Through the notebook settings menu, you can configure how and when marimo runs
cells. In particular, you can disable autorun on startup, disable autorun
on cell execution, and enable a powerful module autoreloader. Read our
on cell execution, and enable a module autoreloader. Read our
[runtime configuration guide](configuration/runtime_configuration.md) to learn more.

## Disabling cells
4 changes: 3 additions & 1 deletion mkdocs.yml
@@ -58,6 +58,8 @@ markdown_extensions:
custom_fences:
- class: mermaid
name: mermaid
- pymdownx.tabbed:
alternate_style: true
- pymdownx.blocks.tab:
alternate_style: true
- pymdownx.tasklist:
@@ -74,7 +76,7 @@ nav:

- User Guide:
- Overview: guides/index.md
- Reactive execution: guides/reactivity.md
- Running cells: guides/reactivity.md
- Interactive elements: guides/interactivity.md
- Visualizing outputs: guides/outputs.md
- Migrating from Jupyter: guides/coming_from/jupyter.md