These instructions were written for the TACO version of october 2021 and was run with python-version 3.8 and R-version 4.1.1
Some steps can be skipped if the software has already been installed previously
- Download TACO
- Download Anaconda -> Python (specify adding anaconda to PATH-variable, still possible to add manually later)
- Install the following packages:
argparse
numpy
pandas
scipy
sklearn
emcee
lightkurve
loguru
Two methods (in Anaconda Prompt!):conda install packagename
ORpip install packagename
- Install R (https://cran.r-project.org/)
- Open the RGui as Administrator and install the following packages by executing
install.packages("packagename")
in the R console:argparser
ifultools
numDeriv
lomb
tidyverse
modelr
- Check that all packages are in the R-library (in directory C:\Program Files\R\R-X.X.X\library)
- Install Git (https://git-scm.com/)
- Update the path variable: search for 'Environment variables' and select 'Edit environment variables' (open as Admin, otherwise no right to edit system variables!), select
Add
next to system variables. Paths to be added (if not already present):C:\Program Files\Git\cmd
C:\Program Files\Git\bin\bash.exe
C:\Program Files\R\R-X.X.X\bin
C:\Users\Your username\Anaconda3\Scripts
C:\Users\Your username\Anaconda3\Library\bin
C:\Users\Your username\Anaconda3
(Optional) Download Visual Studio Code (https://code.visualstudio.com/)
No special recommandations during the installation
Once installed open VSC and go to Extensions (ctrl+shift+x) and add following extensions if not already installed:
- Python
- R
- Excel Viewer
(not required but useful when interpreting the data from .csv-files)
- Install R (https://cran.r-project.org/) and install following packages by executing
install.packages("packagename")
in the R console (to open console runR
in terminal):argparser
ifultools
numDeriv
lomb
tidyverse
modelr
- Install Anaconda (for Linux installer: https://www.anaconda.com/products/individual#linux ; for documentation: https://docs.anaconda.com/anaconda/install/linux/)
- Install the following packages:
argparse
numpy
pandas
scipy
sklearn
emcee
lightkurve
loguru
Two methods (in terminal):conda install packagename
ORpip install packagename
- Download and unpack TACO
Use the shellscript TACO.sh to analyse a star.
-
Create a specific folder for each new star in the directory
data
. Copy the.dat
-file of the desired star into the folder of the star and rename this file toraw.dat
. All output for this star will be inside this folder. Copy./TACO.sh
into the directory where the dataset of the star is located (you might have to change all python3 to python). -
a. Open VSC and start a Git Bash in the terminal window: In terminal window (bottom window usually), click on down arrow next to plus sign and select
Git Bash
In the bash terminal: go to the folder of your star (whereraw.dat
is located) usingcd xxx/xxx/xxx
b. Open a Git Bash terminal (right click) in the directory of the star.
-
Run the command
./TACO.sh
in the Git bash terminal. You might have to run the following in the bash-terminal:conda init bash
Note: By adding or removing additional lines and/or input parameters in ./TACO.sh
, one can finetune the analysis for each dataset.
Use the shellscript TACO.sh to analyse a star.
- Open a terminal
- Create a specific folder for each new star in the directory
data
. Copy the.dat
-file of the desired star into the folder of the star and rename this file toraw.dat
. All output for this star will be inside this folder. Copy./TACO.sh
into the directory where the dataset of the star is located (you might have to change all python3 to python). - In the terminal: go to the folder of your star (where
raw.dat
is located) usingcd
All output for this star will be inside this folder. - Copy
./TACO.sh
into the directory where the dataset of the star is located (you might have to change all python3 to python).
Note: By adding or removing additional lines and/or input parameters in ./TACO.sh
, one can finetune the analysis for each dataset.
Filter the lightcurves with a high-pass filter (e.g. 40 days) in the form of a triangular smooth of the provided width. Additionally it interpolates the single-point missing values. It also calculates the mean and variance of the light curve and saves them in the summary file. I assume the light curve is as provided by the APOKASC group which is an ascii table with three columns representing the time (in days), the observed ppm and the error (which is ignored from now on).
usage: Rscript filter_lightcurve.R [--] [--help] [--opts OPTS] [--tseries TSERIES] [--output OUTPUT] [--summary SUMMARY] [--width WIDTH] [--remove_gaps REMOVE_GAPS]
Make a triangular filter of the time-series.
flags:
-h, --help show this help message and exit
optional arguments:
--opts OPTS RDS file containing argument values
--tseries TSERIES File name of the time-series in tsv format with the first column giving the time and the second the flux. [default: raw.dat]
--output OUTPUT File name on which to save the filtered time-series [default: filtered.csv]
--summary SUMMARY Summary file to save some time-series statistics [default: summary.csv]
--width WIDTH Width of the triangular filter [default: 40]
--remove_gaps REMOVE_GAPS Remove gaps greater than this value (in days). If set to -1, do not remove gaps. [default: -1]
Calculate the Lomb-Scargle periodogram of a time series normalized using the spectral window function as described by Kallinger et al. 2014. The input lightcurve is assumed to have time stamps given in days, the flux in ppm and the periodogram is given in units of micro-Hertz. The functionality for computing the periodogram comes from the lightkurve
package.
usage: python pds.py [--] [--help] [--tseries TSERIES] [--output OUTPUT] [--oversampled_output OVERSAMPLED_OUTPUT] [--ofac OFAC] [--summary SUMMARY]
Make Lomb-Scargle periodogram. We assume the input time is given in days and give the periodogram frequencies in micro-Hertz.
flags:
-h, --help show this help message and exit
optional arguments:
--tseries TSERIES File name of the time-series in csv format with two columns: `time` and `flux`. [default: filtered.csv]
--output OUTPUT File name of the saved periodogram. [default: pds.csv]
--oversampled_output OVERSAMPLED_OUTPUT File name on which to save the oversampled periodogram. [default: ofac_pds.csv]
--ofac OFAC Oversampling factor to use in oversampled periodogram [default: 2]
--summary SUMMARY Summary file to save some periodogram statistics [default: summary.csv]
Make three independent possible estimates of the frequency of maximum oscillation power (nu_max) and compares them to get a final guess. The estimates are based on the following
- Variance of the time series: It was shown by Hekker et al. 2012 that the variance of the time series has an exponential relation with nu_max.
- Peak detections with a tree-map of the mexican-hat continuous wavelet transform: Using method by Du et al. 2006 to detect significant peaks in a data set.
- Morlet wavelet transfroms: The maximum of a Morlet wavelet transform usually occurs near numax.
Updates the summary file with new columns: numax_var
, numax_CWTTree
, numax_Morlet
and numax0
which are the 3 different nu_max estimations and the final guess respectively.
It also adds a column numax0_flag
. If true, the three estimates are different.
usage: Rscript numax_estimate.R [--] [--help] [--opts OPTS] [--pds PDS] [--summary SUMMARY]
Make a rough numax estimation using the periodogram and the variance of the time-series.
We assume that the summary file contains the variance of the time-series and the Nyquist
frequency of the periodogram in columns named 'var' and 'nuNyq', respectively.
flags:
-h, --help show this help message and exit
optional arguments:
-x, --opts OPTS RDS file containing argument values
-p, --pds PDS File name of the PDS in csv format with headers 'frequency' and 'power' [default: pds.csv]
-s, --summary SUMMARY Summary file in csv format with columns named 'var' and 'nuNyq'. [default: summary.csv]
Use an MCMC algorithm (emcee) to make a background estimation for the power-spectrum density normalized according with the above prescription. Note that this script uses some initial guesses that critically depend on the normalization of the power-spectrum density so, if you use a different normalization, this script might need some changes before it's able to work. You need to have installed the emcee
python package (version 3 or greater).
The background model is similar to the one proposed by Kallinger et al. 2015 albeit with some differences. We use by default a unbinned version of the PDS to do the fit but this can be changed with the command-line arguments.
usage: background_fit.py [-h] [--pds PDS] [--summary SUMMARY]
[--posterior POSTERIOR]
[--logfile LOGFILE] [--nwalkers NWALKERS]
[--nwarmup NWARMUP] [--minsteps MINSTEPS]
[--maxsteps MAXSTEPS] [--bins BINS]
Make an MCMC background fit.
optional arguments:
-h, --help show this help message and exit
--pds PDS A csv file with the periodogram on which to make the
fit
--summary SUMMARY A csv file with a 'numax0' column giving an initial
numax estimation
--save_posteriors A flag which when given saves out the posterior distributions
(i.e. MCMC chains) to the file fiven in the `--posterior` flag.
Otherwise if not given then they are not saved.
--posterior POSTERIOR
Destination on which to write the MCMC posterior
(collapsed chains)
--logfile LOGFILE Location on which to write the MCMC log
--nwalkers NWALKERS Number of walkers (chains) for the MCMC fit
--nwarmup NWARMUP Number of steps for the MCMC warmup
--minsteps MINSTEPS Minimum number of steps for the MCMC estimation
--maxsteps MAXSTEPS Maximum number of steps for the whole MCMC run
--bins BINS The PDS will be binned by this number of bins for the
MCMC. Setting it to 1 will not bin it.
By default TACO does NOT save out the chains/posterior distributions from the background fit (since the file is ~50-100 MB per star), only the summary statistics. If you explicitly want to save the MCMC chains then please add the --save_posteriors
flag when running the background fit.
Takes the background-removed power-spectrum density and identifies the relevant oscillations. The oscillations are all modelled as lorentzians. This script estimates their position, width and height which can be later used to construct a prior in a more precise analysis. You need to have installed the R packages: wmtsa
, argparser
and dplyr
. The auxiliary file peakFind_lib.R
must be present in the same directory as peakFind.R
.
usage: Rscript peakFind.R [--] [--help] [--opts OPTS] [--pds PDS] [--ofac_pds OFAC_PDS][--summary SUMMARY] [--output OUTPUT] [--snr SNR] [--prob PROB] [--maxlwd MAXLWD] [--removel02 REMOVEL02] [--peaks PEAKS] [--mixedpeaks MIXEDPEAKS] [--minAIC MINAIC] [--navg NAVG]
Peak-finding in a background-removed PDS. We only look for peaks in the region numax +- 4*sigmaEnv.
flags:
-h, --help show this help message and exit
optional arguments:
-x, --opts OPTS RDS file containing argument values
-p, --pds PDS File name of the background-removed PDS in csv format with headers 'frequency' and 'power' [default: pds_bgr.csv]
--ofac_pds OFAC_PDS File name of the background-removed PDS in csv format with headers 'frequency' and 'power' [default: ofac_pds_bgr.csv]
-s, --summary SUMMARY File name of the summary file in csv format with at least the headers 'numax' and 'sigmaEnv' [default: summary.csv]
-o, --output OUTPUT File name on which to save the results [default: peaks.csv]
--snr SNR Minimum signal-to-noise ratio (on CWT space) for resolved peaks [default: 2]
--prob PROB Minimum (frequentist) probability threshold for unresolved peaks [default: 1e-04]
-m, --maxlwd MAXLWD Maximum search linewidth for resolved peaks in the CWT search [default: 1]
--removel02 REMOVEL02 Whether or not the l02 peaks should be divided out before running the CWT search [default: FALSE]
--peaks PEAKS File name of the identified peaks. It must contain the l=0,2 modes already identified [default: peaksMLE.csv]
--mixedpeaks MIXEDPEAKS File name of the csv file containing the mixed mode peaks from peak finding [default: mixedpeaks.csv]
--minAIC MINAIC Minimum AIC value for a peak to have in order to be considered significant [default: 2]
--navg NAVG Number of power spectra averaged to create current power spectrum [default: 1]
Makes an MLE parameter optimisation taking the peaks found previously with peakFind.R
usage: Rscript peaksMLE.R [--] [--help] [--opts OPTS] [--pds PDS] [--init INIT] [--summary SUMMARY] [--mixedpeaks MIXEDPEAKS] [--mixedoutput MIXEDOUTPUT] [--output OUTPUT] [--maxlwd MAXLWD] [--minAIC MINAIC] [--removel02 REMOVEL02] [--finalfit FINALFIT] [--navg NAVG]
MLE estimation of the peaks found in a background-removed PDS.
flags:
-h, --help show this help message and exit
optional arguments:
-x, --opts OPTS RDS file containing argument values
-p, --pds PDS File name of the background-removed PDS in csv format with headers 'frequency' and 'power' [default: pds_bgr.csv]
-i, --init INIT File name of the intial guesses [default: peaks.csv]
-s, --summary SUMMARY File name of the csv file with summary values. It must contain numax and sigmaEnv [default: summary.csv]
--mixedpeaks MIXEDPEAKS File name of the csv file containing the mixed mode peaks from peak finding [default: mixedpeaks.csv]
--mixedoutput MIXEDOUTPUT File name of the csv file to save the mixed mode MLE results to [default: mixedpeaksMLE.csv]
-o, --output OUTPUT File name on which to save the results [default: peaksMLE.csv]
-m, --maxlwd MAXLWD Maximum linewidth for peaks in the MLE optimization [default: 0.5]
--minAIC MINAIC Minimum AIC value for a peak to have in order to be considered significant [default: 2]
--removel02 REMOVEL02 Whether or not the l02 peaks should be divided out before running the CWT search [default: FALSE]
--finalfit FINALFIT Whether or not this is the final MLE optimisation [default: FALSE]
--navg NAVG Number of power spectra averaged to create current power spectrum [default: 1]
Takes the oscillations found and tags them according to their spherical degree. It also calculate the large frequency separation Δν
usage: Rscript peakBagModeId02.R [--] [--help] [--opts OPTS] [--peaks PEAKS] [--summary SUMMARY] [--pds PDS] [--output OUTPUT]
Mode identification for the peaks found.
flags:
-h, --help show this help message and exit
optional arguments:
-x, --opts OPTS RDS file containing argument values
-p, --peaks PEAKS File name of the identified peaks [default: peaksMLE.csv]
-s, --summary SUMMARY File name of the csv file with summary values. It must contain numax and sigmaEnv [default: summary.csv]
--pds PDS File name of the csv file with the PDS It must contain the columns 'frequency' and 'power [default: pds_bgr.csv]
-o, --output OUTPUT File name of the csv file on which to write the results [default: peaksMLE.csv]
Computes the period spacing and coupling using the power spectrum of the stretched power spectrum from Vrard et al. (2016). This will also require the sloscillations
python package for the fast mixed mode calculations.
usage: python peakBagPeriodSpacing.py [--peaks PEAKS] [--summary SUMMARY] [--pds PDS] [--full_pds FULLPDS] [--maxiters MAXITERS] [--niters NITERS] [--dpi_only DPIONLY] [--ncores NCORES]
Period spacing and coupling factor calculation
flags:
-h, --help show this help message and exit
optional arguments:
--peaks PEAKS File name of the identified peaks [default: peaksMLE.csv]
--summary SUMMARY File name of the csv file with summary values (it must contain the parameters numax and sigmaEnv) [default: summary.csv]
--pds PDS File name of the csv file with the background-subtracted PDS It must contain the columns "frequency" and "power" [default: pds_bgr.csv]
--full_pds FULLPDS File name of the csv file with the original PDS It must contain the columns "frequency" and "power" [default: pds.csv]
--maxiters MAXITERS Maximum number of iterations to use in the self-consistent calculation of ΔΠ1 [default: 10]
--niters NITERS Number of iterations to repeat the calculation of ΔΠ1, q and ε_g [default: 5]
--dpi_only DPIONLY Only infer the period spacing and don't calculate tau or q [default: FALSE]
--ncores NCORES Number of cores to use for a parallel calculation [default: 1]
Computes the rotational splitting using the echelle of the stretched power spectrum roughly following Gehan (2018). This will also require the sloscillations
python package for the fast mixed mode calculations.
usage: python peakBagRotation.py [--peaks PEAKS] [--summary SUMMARY] [--pds PDS] [--full_pds FULLPDS] [--output OUTPUT] [--ncores NCORES] [--cut CUT]
Period spacing and coupling factor calculation
flags:
-h, --help show this help message and exit
optional arguments:
--peaks PEAKS File name of the identified peaks [default: peaksMLE.csv]
--summary SUMMARY File name of the csv file with summary values (it must contain the parameters numax and sigmaEnv) [default: summary.csv]
--pds PDS File name of the csv file with the background-subtracted PDS It must contain the columns "frequency" and "power" [default: pds_bgr.csv]
--full_pds FULLPDS File name of the csv file with the original PDS It must contain the columns "frequency" and "power" [default: pds.csv]
--output OUTPUT File name of the csv file on which to write the results [default: peaksMLE.csv]
--ncores NCORES Number of cores to use for a parallel calculation [default: 1]
--cut CUT Cut in zeta for rotational splitting finder [default: 0.7]