This project provides a conda environment file, so you can reproduce the environment with the following commands:
git clone https://github.com/chengtan9907/OpenSTL
cd OpenSTL
conda env create -f environment.yml
conda activate OpenSTL
python setup.py develop # or `pip install -e .`
Requirements
- Linux (Windows is not officially supported)
- Python 3.7+
- PyTorch 1.8 or higher
- CUDA 10.1 or higher
- NCCL 2
- GCC 4.9 or higher
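The version requirements above can be checked with a short script before installing. This is a minimal sketch, not part of OpenSTL: the helper names (`parse_version`, `python_ok`, `torch_ok`) are hypothetical, and the upper Python bound is an assumption taken from the Python<=3.10.x note in this document.

```python
import sys

MIN_PYTHON = (3, 7)   # "Python 3.7+" from the requirements list
MAX_PYTHON = (3, 10)  # assumed upper bound, from the Python<=3.10.x note
MIN_TORCH = (1, 8)    # "PyTorch 1.8 or higher"


def parse_version(text):
    """Return (major, minor) from a version string such as '1.13.1+cu117'."""
    parts = text.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def python_ok(version_info=None):
    """True if the interpreter's (major, minor) lies in the supported range."""
    info = sys.version_info if version_info is None else version_info
    return MIN_PYTHON <= tuple(info[:2]) <= MAX_PYTHON


def torch_ok(version_text):
    """True if the given PyTorch version string is at least 1.8."""
    return parse_version(version_text) >= MIN_TORCH


if __name__ == "__main__":
    print("Python OK:", python_ok())
    try:
        import torch
        print("PyTorch OK:", torch_ok(torch.__version__))
        print("CUDA available:", torch.cuda.is_available())
    except ImportError:
        print("PyTorch is not installed")
```

The script only reports what it finds. It does not check the CUDA, NCCL, or GCC versions.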
Dependencies
- argparse
- dask
- decord
- fvcore
- hickle
- lpips
- matplotlib
- netcdf4
- numpy
- opencv-python
- packaging
- pandas
- python<=3.10.8
- scikit-image
- scikit-learn
- torch
- timm
- tqdm
- xarray==0.19.0
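To see which of the listed dependencies are still missing from the active environment, a sketch along the following lines may help. The function name `find_missing` is hypothetical. The mapping reflects the fact that some packages are imported under a different name than they are installed under (for example, opencv-python is imported as cv2).

```python
import importlib.util

# Package name as listed above -> module name used in `import`.
IMPORT_NAMES = {
    "argparse": "argparse",
    "dask": "dask",
    "decord": "decord",
    "fvcore": "fvcore",
    "hickle": "hickle",
    "lpips": "lpips",
    "matplotlib": "matplotlib",
    "netcdf4": "netCDF4",
    "numpy": "numpy",
    "opencv-python": "cv2",
    "packaging": "packaging",
    "pandas": "pandas",
    "scikit-image": "skimage",
    "scikit-learn": "sklearn",
    "torch": "torch",
    "timm": "timm",
    "tqdm": "tqdm",
    "xarray": "xarray",
}


def find_missing(import_names=IMPORT_NAMES):
    """Return the package names whose module cannot be located."""
    missing = []
    for package, module in import_names.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    print("Missing packages:", find_missing() or "none")
```

This only checks that each module can be found. It does not verify pinned versions such as xarray==0.19.0.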
Note:
- Installation errors.
  - If you are installing cv2 for the first time, `ImportError: libGL.so.1` can occur, which can be solved by `apt install libgl1-mesa-glx`.
  - Errors might occur with hickle and its dependencies when using the KittiCaltech dataset. You can solve them by installing the additional packages named in the error output.
  - With WeatherBench, you may encounter import or runtime errors depending on the version of xarray. You can install the latest version or xarray==0.19.0 to solve the errors, e.g., `pip install xarray==0.19.0`, and then install any required packages according to the error messages.
  - Please use Python<=3.10.x to prevent the timm error `ValueError: mutable default <class 'timm.models.maxxvit.MaxxVitConvCfg'> for field conv_cfg is not allowed: use default_factory`. Refer to issue #1530 in issue #62.
- Following the above instructions, OpenSTL is installed in dev mode, so any local modifications made to the code take effect without reinstalling. You can instead install it with `pip install .` to use it as a PyPI package, in which case you must reinstall it for local modifications to take effect.
It is recommended to symlink your dataset root (assuming $YOUR_DATA_ROOT) to $OPENSTL/data. If your folder structure is different, you need to change the corresponding paths in the config files.
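The symlink can be created from Python as sketched below. The function `link_data_root` is a hypothetical helper, not part of OpenSTL, and the two arguments stand in for $YOUR_DATA_ROOT and $OPENSTL.

```python
import os


def link_data_root(data_root, openstl_root):
    """Create <openstl_root>/data as a symlink to data_root; return the link path."""
    link_path = os.path.join(openstl_root, "data")
    # Refuse to overwrite an existing folder or link.
    if os.path.islink(link_path) or os.path.exists(link_path):
        raise FileExistsError(f"{link_path} already exists; remove or rename it first")
    os.symlink(os.path.abspath(data_root), link_path, target_is_directory=True)
    return link_path
```

On Linux this has the same effect as running `ln -s $YOUR_DATA_ROOT $OPENSTL/data` in a shell.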
We support the following datasets: Human3.6M [download], KTH Action [download], KittiCaltech Pedestrian [download], Moving MNIST [download], TaxiBJ [download], WeatherBench [download]. Please prepare the datasets with the tools and scripts under tools/prepare_data. You can also download the version we used in experiments from Baidu Cloud (kjfk). Please do not distribute the datasets, and use them only for research. Once prepared, the datasets under $OPENSTL/data will look like this:
OpenSTL
├── configs
└── data
    ├── caltech
    │   ├── set06
    │   ├── ...
    │   ├── set10
    │   ├── data_cache.npy
    │   ├── indices_cache.npy
    ├── human
    │   ├── images
    │   ├── test.txt
    │   ├── train.txt
    ├── kinetics400
    │   ├── annotations
    │   ├── replacement
    │   ├── test
    │   ├── train
    │   ├── val
    ├── kitti_hkl
    │   ├── sources_test_mini.hkl
    │   ├── ...
    │   ├── X_train.hkl
    │   ├── X_val.hkl
    ├── kth
    │   ├── boxing
    │   ├── ...
    │   ├── walking
    ├── moving_fmnist
    │   ├── fmnist_test_seq.npy
    │   ├── train-images-idx3-ubyte.gz
    ├── moving_mnist
    │   ├── mnist_test_seq.npy
    │   ├── train-images-idx3-ubyte.gz
    ├── softmotion30_44k
    │   ├── test
    │   ├── train
    ├── taxibj
    │   ├── dataset.npz
    ├── weather
    │   ├── 2m_temperature
    │   ├── ...
    ├── weather_1_40625deg
    │   ├── 2m_temperature
    │   ├── ...
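
After preparing the data, you can compare your data folder against the tree above. The following is a minimal sketch: `missing_dataset_dirs` is a hypothetical helper, and the folder names are copied from the tree. You only need the folders for the datasets you actually plan to use.

```python
import os

# Top-level dataset folders, copied from the tree above.
EXPECTED_DIRS = [
    "caltech", "human", "kinetics400", "kitti_hkl", "kth",
    "moving_fmnist", "moving_mnist", "softmotion30_44k",
    "taxibj", "weather", "weather_1_40625deg",
]


def missing_dataset_dirs(data_dir, expected=EXPECTED_DIRS):
    """Return the expected folder names that are absent under data_dir."""
    # os.path.isdir follows symlinks, so a symlinked data root works too.
    return [name for name in expected
            if not os.path.isdir(os.path.join(data_dir, name))]


if __name__ == "__main__":
    print("Missing dataset folders:", missing_dataset_dirs("data") or "none")
```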
Moving MNIST and Moving FMNIST are toy datasets that generate gray-scale videos (64x64 resolution) with two objects. We provide download_mmnist.sh and download_mfmnist.sh, which download the datasets from MMNIST download and MFMNIST download. Note that the train set is generated online while the test set is fixed to ensure the consistency of evaluation results. We also provide a combined version of MNIST and CIFAR-10 and noisy versions of Moving MNIST (dynamic / missing / perceptual) in the dataset implementation.
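The idea of generating training videos online can be illustrated with a small sketch of a bouncing trajectory. This is not the OpenSTL implementation: the function `bounce_trajectory`, the 28x28 patch size (the size of an MNIST digit), the step size, and the 20-frame length are all assumptions chosen for illustration.

```python
import math
import random


def bounce_trajectory(num_frames, canvas=64, patch=28, step=0.1, rng=None):
    """Return top-left (x, y) pixel positions of one patch bouncing in a square canvas."""
    if rng is None:
        rng = random.Random()
    limit = canvas - patch  # largest top-left coordinate that keeps the patch inside
    x, y = rng.random(), rng.random()   # normalised start position in [0, 1)
    angle = rng.random() * 2 * math.pi  # random direction of motion
    vx, vy = step * math.cos(angle), step * math.sin(angle)
    positions = []
    for _ in range(num_frames):
        x, y = x + vx, y + vy
        # Reflect off the walls: clamp the position and flip the velocity.
        if x < 0 or x > 1:
            x, vx = min(max(x, 0.0), 1.0), -vx
        if y < 0 or y > 1:
            y, vy = min(max(y, 0.0), 1.0), -vy
        positions.append((round(x * limit), round(y * limit)))
    return positions


# Two objects per video; a fixed seed makes the result repeatable.
rng = random.Random(0)
tracks = [bounce_trajectory(20, rng=rng) for _ in range(2)]
```

Drawing fresh random trajectories for every sample gives a train set that never repeats, whereas evaluation needs identical sequences on every run, which is why the test set is stored on disk.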
The BAIR dataset uses BAIR Robot Pushing for both the train set (648960 videos) and the test set (3840 videos). We provide download_bair.sh to prepare the dataset, and you can also download the data from BAIR download. The data preprocessing of the RGB videos (64x64 resolution) and the experiment settings are adopted from PredRNN.
The KittiCaltech Pedestrian dataset uses Kitti Pedestrian as the train set (2042 videos) and Caltech Pedestrian as the test set (1983 videos). We provide download_kitticaltech.sh to prepare the datasets. The data preprocessing of the RGB videos (128x160 resolution) and the experiment settings are adopted from PredNet.
The KTH Action dataset contains grey-scale videos (resized from 160x120 to 128x128 resolution) of six types of human actions performed several times by 25 subjects in four different scenarios. It has 5200 and 3167 videos for the train and test sets, respectively, and can be downloaded from KTH download in avi format. For convenience, we use the image version released in PredRNN and provide download_kth.sh to prepare the dataset. The data preprocessing and experiment settings are adopted from KTH and PredRNN.
The Human3.6M dataset contains high-resolution videos (1024x1024 resolution) of seventeen scenarios of human actions performed by eleven professional actors, which can be downloaded from Human3.6M download. We provide download_human3.6m.sh to prepare the dataset. We borrow the train and test split files from STRPM but use 256x256 resolution in our experiments.
The Kinetics-400 dataset contains real-world human action videos (around 256x320 resolution) of 400 human action classes, with at least 400 video clips for each action. Each clip lasts around 10s and is taken from a different YouTube video. It has 246534 and 39805 videos for the train and test sets, respectively, which can be downloaded from Kinetics download. We provide download_kinetics.sh to prepare the dataset according to kinetics-dataset. Similar to Human3.6M, we use 256x256 resolution in our experiments for faster training.
WeatherBench is a publicly available dataset for global weather prediction, which can be downloaded and processed from WeatherBench download. We choose some important weather variables at certain vertical levels and resolutions, e.g., 2m_temperature, relative_humidity, and total_cloud_cover. You can download a specific WeatherBench dataset with download_weatherbench.sh. Note that 5.625deg and 1.40625deg indicate 32x64 and 128x256 resolutions, respectively, and the data can have multiple channels.
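The relation between the angular resolution and the grid size is simple arithmetic, sketched below under the assumption of a regular latitude-longitude grid covering 180 degrees of latitude and 360 degrees of longitude. The function `grid_shape` is a hypothetical helper for illustration.

```python
def grid_shape(resolution_deg):
    """Return (latitude points, longitude points) for a regular global grid."""
    n_lat = round(180 / resolution_deg)
    n_lon = round(360 / resolution_deg)
    return n_lat, n_lon


print(grid_shape(5.625))    # (32, 64)
print(grid_shape(1.40625))  # (128, 256)
```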
TaxiBJ is a popular traffic trajectory prediction dataset. It contains two-channel trajectory data (32x32) in Beijing collected from taxicab GPS and can be downloaded from OneDrive. We provide download_taxibj.sh to prepare the dataset, or you can download it from Baidu Cloud. We borrow the data preprocessing scripts from DeepST and provide the processed data in our Baidu Cloud.