Introduction
This repository was created as part of the coursework for the Fundamentals of Data Analysis module in the Higher Diploma in Computer Programming in Data Analytics provided by Atlantic Technological University.
Purpose
This repository has two Jupyter notebooks: one for the practical elements of the course and one for the assignment on the normal distribution.
System Requirements
Running or modifying the notebooks on a local machine requires the latest version of Python. Anaconda is an easy-to-use distribution available for Windows, Mac, and Linux operating systems. Alternatively, a number of web-based options are available.
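As a rough way to confirm that a local environment is ready, the sketch below checks whether the packages named in the references (numpy, pandas, matplotlib) can be found. The package list and the `missing_packages` helper are illustrative assumptions rather than part of the notebooks, and `notebook` is assumed to be the package that provides the Jupyter Notebook interface.

```python
import importlib.util
import sys

# Assumed package list: numpy, pandas and matplotlib appear in the references;
# "notebook" is assumed to provide the Jupyter Notebook interface.
REQUIRED = ["numpy", "pandas", "matplotlib", "notebook"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python version:", sys.version.split()[0])
missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are available.")
```

Any package reported as missing can usually be installed through Anaconda or pip before opening the notebooks.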
Running Jupyter Notebooks
The following link explains how to launch Jupyter Notebook from a terminal.
https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/execute.html
Practicals Notebook
This notebook contains the exercises (regular tasks) given throughout the semester. It is organised so that each exercise sits under its corresponding topic.
Normal Distribution Notebook
This notebook covers the normal distribution: it defines the distribution, explains the key concepts, and illustrates them with visuals.
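To give a flavour of the topic, here is a minimal sketch that is not taken from the notebook. It evaluates the normal probability density function and draws random samples using only the Python standard library; the function name `normal_pdf`, the seed, and the sample size are assumptions chosen for illustration.

```python
import math
import random
import statistics


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a normal distribution with mean mu and std sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Draw samples from a standard normal distribution (seeded for repeatability).
rng = random.Random(42)
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

sample_mean = statistics.fmean(samples)
sample_std = statistics.stdev(samples)

# Share of samples within one standard deviation of the mean
# (roughly 68% is expected for a normal distribution).
within_one_sigma = sum(abs(s) <= 1.0 for s in samples) / len(samples)

print(f"pdf at 0: {normal_pdf(0.0):.4f}")
print(f"sample mean: {sample_mean:.3f}, sample std: {sample_std:.3f}")
print(f"share within one sigma: {within_one_sigma:.3f}")
```

The sample mean should land close to 0 and the sample standard deviation close to 1, with about two thirds of the samples falling within one standard deviation of the mean.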
General References:
- DataCamp - numerous courses/tracks completed over the last number of months have supported this exercise
- Udemy course: https://www.udemy.com/course/the-modern-python3-bootcamp/learn/lecture/8680110?start=94#overview
- W3Schools - Resource used on a regular basis: https://www.w3schools.com/
- Stack Overflow - Resource used to help troubleshoot problems and help with coding: https://stackoverflow.com/
Practicals Notebook References:
- matplotlib.pyplot: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html
- pandas: https://pandas.pydata.org/
- numpy: https://numpy.org/doc/stable/index.html
Topic 1: Information
Exercise 1:
- Lecturer notebook: https://github.com/ianmcloughlin/2223-S1-fund-data-analysis/blob/main/notebooks/01-information.ipynb
- Lectures to support topic on Moodle: https://vlegalwaymayo.atu.ie/course/view.php?id=6386
- Troubleshooting: https://stackoverflow.com/
- Troubleshooting: https://www.w3schools.com/
- Troubleshooting: https://www.youtube.com/
Exercise 2:
- https://en.wikipedia.org/wiki/Logarithm
- https://www.britannica.com/science/logarithm
- https://www.geeksforgeeks.org/logarithm-formula/
- https://www.w3schools.com/python/ref_math_log.asp
Topic 2: Randomness
Exercise 1:
Exercise 2:
Exercise 3:
Uniform:
- https://numpy.org/doc/stable/reference/random/generated/numpy.random.uniform.html
Gamma:
- https://www.statology.org/gamma-distribution-in-python/
Poisson:
- https://www.statology.org/poisson-distribution-python/
Topic 3: Bias
Exercise 1:
- https://www.kdnuggets.com/2020/06/five-cognitive-biases-data-science.html
- https://www.youtube.com/watch?v=wEwGBIr_RIw
- https://towardsdatascience.com/cognitive-biases-facing-data-scientists-86489e99dea8
Exercise 2:
Topic 4: Outliers
Exercise 1:
- https://stackoverflow.com/questions/31842892/how-to-add-labels-to-a-boxplot-figure-pylab
- https://stackoverflow.com/questions/61734304/label-outliers-in-a-boxplot-python
- https://towardsdatascience.com/create-and-customize-boxplots-with-pythons-matplotlib-to-get-lots-of-insights-from-your-data-d561c9883643
- https://www.nickmccullum.com/python-visualization/boxplot/
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.boxplot.html
Exercise 2:
- https://www.w3schools.com/python/matplotlib_intro.asp
- https://www.w3schools.com/python/matplotlib_plotting.asp
Exercise 3:
- https://www.mathsisfun.com/algebra/directly-inversely-proportional.html
- https://neptune.ai/blog/pandas-plot-deep-dive-into-plotting-directly-with-pandas
- https://www.oreilly.com/library/view/python-data-science/9781491912126/ch04.html
- https://stackoverflow.com/questions/2051744/how-to-invert-the-x-or-y-axis
- https://stackoverflow.com/questions/66103601/inverse-y-axis-in-python-scatter-plt
Topic 5: Cleansing
Exercise 1:
Exercise 2:
- https://learnpython.com/blog/uppercase-letter-python/#:~:text=capitalize()&text=It%20is%20used%20just%20like%20the%20title()%20method.&text=We%20know%20the%20capitalize(),method%20to%20capitalize%20each%20word.
- https://stackoverflow.com/a/1393367
Normal Distribution Notebook References:
- https://www.geeksforgeeks.org/how-to-plot-normal-distribution-over-histogram-in-python/
- https://www.statology.org/generate-normal-distribution-python/
- https://www.w3schools.com/python/python_ml_normal_data_distribution.asp
- https://en.wikipedia.org/wiki/LaTeX
- https://studiousguy.com/real-life-examples-normal-distribution/
- https://www.middlesex.mass.edu/ace/downloads/tipsheets/normal_cf.pdf
- https://statisticsbyjim.com/basics/normal-distribution/
Readme file editing:
- https://medium.com/analytics-vidhya/the-jupyter-notebook-formatting-guide-873ab39f765e
- https://www.freecodecamp.org/news/how-to-write-a-good-readme-file/
Contact:
G00217642@atu.ie