Commit
1 parent
babfa9c
commit f17375d
Showing
8 changed files
with
114 additions
and
4 deletions.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
(creating-log-files)= | ||
# Creating log files | ||
|
||
To document that you have actually run your code, a log file, a transcript, or some other evidence may be useful. Certain journals may even require it. | ||
|
||
## TL;DR | ||
|
||
- Log files are a way to document that you have run your code. | ||
- In particular for code that runs for a very long time, or that uses data that cannot be shared, log files may be the only way to document basic reproducibility. | ||
|
||
## Overview | ||
|
||
Most statistical software has ways to keep a record that it has run, with the details of that run. Some make it easier than others. In some cases, you may need to instruct your code to be "verbose", or to "log" certain events. In other cases, you may need to use a command-line option to the software to create a log file. | ||
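When the software itself offers no convenient logging option, the shell can capture what a run prints. The following is a minimal sketch, assuming a Unix-like shell; the function name `run_logged`, the `logs` directory, and the `echo` stand-in for the real call to the statistical software are all illustrative assumptions, not part of any particular package.

```bash
#!/usr/bin/env bash
# Sketch: capture everything a command prints into a timestamped log file.
set -euo pipefail

# run_logged LOGDIR COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND, shows its output on screen, and saves a copy in LOGDIR.
run_logged() {
    local logdir="$1"; shift
    mkdir -p "$logdir"
    # File name pattern: logfile_<date>-<time>-<user>.log
    local stamp user
    stamp="$(date +%Y-%m-%d-%H_%M_%S)"
    user="$(id -un)"
    logfile="$logdir/logfile_${stamp}-${user}.log"
    # 2>&1 sends error messages to the same log as regular output
    "$@" 2>&1 | tee "$logfile"
}

# Stand-in for the real call to the statistical software
run_logged logs echo "Hello from the analysis"
echo "Log written to: $logfile"
```

Replace the `echo` call with the actual command that runs your main script. Because `tee` both displays and stores the output, you can still watch the run as it happens.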
|
||
```{warning} | ||
Note that we are typically only looking to document what the statistical code does, at a high level. We are not looking to document system calls, fine-grained data access, etc. Computer scientists and IT security mavens may be interested in such details, but economists typically are not. | ||
``` | ||
|
||
## Examples | ||
|
||
### Explicit log files | ||
|
||
We start by describing how to explicitly generate log files as part of the statistical processing code. | ||
|
||
::::{tab-set} | ||
|
||
|
||
:::{tab-item} Stata | ||
|
||
global logdir "${rootdir}/logs" | ||
cap mkdir "$logdir" | ||
local c_date = c(current_date) | ||
local cdate = subinstr("`c_date'", " ", "_", .) | ||
local c_time = c(current_time) | ||
local ctime = subinstr("`c_time'", ":", "_", .) | ||
local globallog = "$logdir/logfile_`cdate'-`ctime'-`c(username)'.log" | ||
log using "`globallog'", name(global) replace text | ||
|
||
::: | ||
|
||
:::: | ||
|
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,9 @@ | ||
# Use a new computer | ||
|
||
Some authors may have a fresh, or extra, computer lying around. Use that to download the replication package, and see if it runs. | ||
The ultimate isolated environment is an otherwise untouched computer. Some authors, or the IT departments at their institutions, may have a freshly imaged computer lying around, with the relevant software (say, Stata or Python) already installed. | ||
|
||
```{tip} | ||
If you ever change institutions, employers, or simply buy a new laptop - this might be you! | ||
``` | ||
|
||
Use such an "unblemished" computer to download the replication package, and see if it runs. Keep track of any additional configuration steps you need to take that you had not thought of. |
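One way to keep track is to record the state of the machine before you change anything, and then append a line for every extra step. The sketch below assumes a Unix-like shell (macOS, Linux, or Git Bash on Windows); the file name `setup-notes.txt` and the list of programs are hypothetical and should be adapted to your own package.

```bash
#!/usr/bin/env bash
# Sketch: note what the fresh computer looks like before you change anything.
notes="setup-notes.txt"   # hypothetical file name
{
    echo "Date: $(date)"
    echo "System: $(uname -a)"
    # Check which of the programs you expect are already on the PATH
    for prog in stata-mp R python3 git; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "found:   $prog ($(command -v "$prog"))"
        else
            echo "missing: $prog"
        fi
    done
} > "$notes"
cat "$notes"
```

Each time you install or configure something, add a line with `echo "what you did" >> setup-notes.txt`. The resulting file is a good starting point for the setup instructions in your README.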
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,38 @@ | ||
# Use of containers | ||
|
||
|
||
## TL;DR | ||
|
||
- Containers are a way to simulate a "computer within a computer", which can be used to run code in an isolated environment. They are relatively lightweight, and are starting to be used as part of replication packages in economics. | ||
- They do not work in all situations, and require some more advanced technical skills. | ||
- Using containers only to test for reproducibility is easier, and should be considered part of your toolkit. | ||
- Several online services make such testing (and development) easy. | ||
|
||
## Overview | ||
|
||
Coming soon. | ||
|
||
Containers can be shared via online systems (Docker Hub, Singularity Hub, etc.), or via files (`.tar` files, etc.). While the former is convenient, the latter is more robust for archival purposes. | ||
|
||
```{warning} | ||
Commercial container sharing services regularly purge containers from their services if they are not actively used, or if a subscription is not maintained. While the core infrastructure containers, such as for Python or R, are likely to be maintained for a long time, commercial companies can change their preservation policies at any time, with little warning. | ||
``` | ||
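Saving a container image to a file for archiving can be sketched as follows. This assumes Docker is installed and the image has already been pulled; the helper `archive_name` is a hypothetical convenience for building a file name, and the image tag is simply the one used in the example on this page.

```bash
#!/usr/bin/env bash
# Sketch: turn an image name into a safe file name, then save the image to it.

# archive_name IMAGE -> prints a file name such as dataeditors_stata17_2023-08-29.tar
archive_name() {
    # Replace "/" and ":" (awkward in file names) with "_"
    printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

image="dataeditors/stata17:2023-08-29"
archive="$(archive_name "$image")"
echo "Archive file: $archive"

# Only attempt the export when Docker is installed and the image is present locally.
if command -v docker >/dev/null 2>&1 && docker image inspect "$image" >/dev/null 2>&1; then
    docker save -o "$archive" "$image"    # write the image to a .tar file
    # Later, on another machine: docker load -i "$archive"
fi
```

The resulting `.tar` file can be deposited alongside the replication package, subject to the license terms of any software inside the image.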
|
||
|
||
## Examples | ||
|
||
```bash | ||
docker run -it --rm \ | ||
-v "$(pwd)":/project \ | ||
-w /project \ | ||
dataeditors/stata17:2023-08-29 \ | ||
-b do main.do | ||
``` | ||
|
||
## Additional resources | ||
|
||
- [Docker](https://www.docker.com/) is a free, open-source container manager, which allows users to create containers using "recipes" (called `Dockerfiles`). While the underlying technology is usually Linux, [Docker Desktop](https://www.docker.com/products/docker-desktop) (commercial, free for most academic uses) allows users to run containers on Windows, macOS, and Linux. | ||
- [OrbStack](https://www.orbstack.com/) is a container manager for macOS (commercial, free for typical academic usage). It is compatible with Docker. | ||
- [Apptainer](https://www.apptainer.io/), formerly known as [Singularity](https://sylabs.io/singularity/), is a free, open-source container manager. It can use Docker images, but has its own syntax for "recipes". It is fundamentally Linux-based, and available on many university HPC clusters. | ||
|
||
Various other container managers are available for both Linux and Windows (Azure) based clouds (`podman`, etc.). They should all be able to run Docker containers. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
(virtual-machines)= | ||
# Virtual Machines | ||
|
||
|
||
## TL;DR | ||
|
||
- Virtual machines are a way to create a "computer within a computer", which can be used to run code in an isolated environment. They are not usually part of replication packages in economics, and I do not suggest you use them as such, but they can be used to test replication packages. | ||
|
||
## Overview | ||
|
||
For the sake of completeness, we will mention that you can also achieve the same outcome as using a brand-new computer by using a virtual machine on your own system. Virtual machines are routinely used in computer science and other domains (including as class assignments in CS courses). Basic software is free, and there are standards for sharing virtual machine files (the specifications and the actual contents). | ||
|
||
```{warning} | ||
Virtual machines are not typically used in economics, and in particular not as a key component of replication packages. They are presented here primarily as an advanced tool to **test** replication packages. | ||
``` | ||
|
||
## Examples | ||
|
||
None at this point. | ||
|
||
## Additional resources | ||
|
||
- [Oracle VirtualBox](https://www.virtualbox.org/) is free, open-source virtual machine software, originally developed by Innotek and later acquired by Sun Microsystems (now part of Oracle). It is available for Windows, macOS, and Linux. | ||
- [VMware Workstation Player](https://www.vmware.com/products/workstation-player.html) is commercial virtual machine software, with a free "player" version for Windows and Linux. | ||
|
||
Naturally, one would like to have virtual machines be reproducibly created, and this is possible using tools such as: | ||
|
||
- [Vagrant](https://www.vagrantup.com/) is a free, open-source virtual machine manager, which allows users to create virtual machines using "recipes", similar to Dockerfiles. It is available for Windows, macOS, and Linux. | ||
- [Multipass](https://multipass.run/) is a free, open-source virtual machine manager. While it can only handle creating Linux VMs, the tool itself is available for Windows, macOS, and Linux. | ||
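As a rough illustration of testing a replication package in a throwaway VM, the sketch below uses the Multipass command line. It assumes Multipass is installed; the VM name `replication-test`, the `/project` mount point, and the `ls` stand-in for running the actual code are hypothetical. By default the script only prints the commands, so that you can review them first.

```bash
#!/usr/bin/env bash
# Sketch: test a replication package in a throwaway Linux VM with Multipass.
# By default this only PRINTS the commands; set DRY_RUN=0 to really run them.

vm="replication-test"      # hypothetical VM name

# Print a command, and execute it unless we are in dry-run mode
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" != "1" ]; then "$@"; fi
}

vm_test_plan() {
    run multipass launch --name "$vm"              # create a fresh Ubuntu VM
    run multipass mount "$(pwd)" "$vm":/project    # share the package directory
    run multipass exec "$vm" -- ls /project        # stand-in for running the code
    run multipass delete "$vm"                     # mark the VM as deleted
    run multipass purge                            # permanently remove deleted VMs
}

vm_test_plan
```

Note that the fresh VM will not have Stata, R, or other software installed; installing what the package needs, and writing those steps down, is part of the test.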
|