feat: setting up github actions to run on every push
simojo committed Apr 25, 2024
1 parent 75ec93c commit 77eec77
Showing 3 changed files with 27 additions and 17 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/main.yml
```diff
@@ -1,8 +1,6 @@
 name: Release Senior Thesis
-on:
-  push:
-    tags:
-      - '*.*.*'
+on: [push]
 
 jobs:
   publish:
     runs-on: ubuntu-latest
@@ -37,6 +35,7 @@ jobs:
       - name: Create release
         id: create_release
         uses: actions/create-release@v1
+        if: startsWith(github.ref, 'refs/tags/')
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         with:
@@ -47,6 +46,7 @@ jobs:
       - name: Upload released asset
         id: upload-release-asset
        uses: actions/upload-release-asset@v1
+        if: startsWith(github.ref, 'refs/tags/')
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
```
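Taken together, the two edits form a common pattern: trigger the workflow on every push so the thesis always builds, but guard the release steps so they only run when the pushed ref is a tag. A minimal sketch of that pattern is below; everything outside the diffed lines (the checkout/build steps, the `with:` inputs) is abbreviated and assumed, not taken from the repository.

```yaml
# Sketch: build on every push, release only on tag pushes.
name: Release Senior Thesis

on: [push]

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      # ... checkout and build steps run for every push ...
      - name: Create release
        id: create_release
        uses: actions/create-release@v1
        if: startsWith(github.ref, 'refs/tags/')  # skipped unless a tag was pushed
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

With this setup, an ordinary `git push` exercises the build, while pushing a tag (e.g. `git tag 1.0.0 && git push origin 1.0.0`) additionally produces a release.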
24 changes: 12 additions & 12 deletions abstract.md
```diff
@@ -5,15 +5,15 @@ surroundings in a real world environment, and it is necessary to realize
 technologies such as fully autonomous unmanned aerial vehicles (UAVs) and land
 vehicles. Reinforcement Learning (RL) has proven to be a novel and effective
 method for autonomous navigation and control, as it is capable of optimizing a
-method of converting its instantaneous state to an action at a point in time
-[@gugan2023; @song2023; @doukhi2022]. Here we use a Deep Deterministic Policy
-Gradient (DDPG) RL algorithm to train the COEX Clover quadcopter system to
-perform autonomous navigation. With the advent of solid state lasers,
-miniaturized optical ranging systems have become ubiquitous for aerial robotics
-because of their low power and accuracy [@raj2020]. By equipping the Clover with
-ten Time of Flight (ToF) ranging sensors, we supply continuous spatial data in
-combination with inertial data to determine the quadcopter's state, which is
-then mapped to its control output. Our results suggest that, while the DDPG
-algorithm is capable of training a quadcopter system for autonomous navigation,
-its computation-heavy nature leads to delayed convergence, and relying on
-discretized algorithms may permit more rapid convergence across episodes.
+method of converting its instantaneous state to an action at a point in time.
+Here we use a Deep Deterministic Policy Gradient (DDPG) RL algorithm to train
+the COEX Clover quadcopter system to perform autonomous navigation. With the
+advent of solid state lasers, miniaturized optical ranging systems have become
+ubiquitous for aerial robotics because of their low power and accuracy. By
+equipping the Clover with ten Time of Flight (ToF) ranging sensors, we supply
+continuous spatial data in combination with inertial data to determine the
+quadcopter's state, which is then mapped to its control output. Our results
+suggest that, while the DDPG algorithm is capable of training a quadcopter
+system for autonomous navigation, its computation-heavy nature leads to delayed
+convergence, and relying on discretized algorithms may permit more rapid
+convergence across episodes.
```
12 changes: 11 additions & 1 deletion thesis.md
```diff
@@ -94,6 +94,8 @@ using simpler, more economically affordable sensors can enable a quadcopter to
 fly in a GPS-denied environment without the use of LiDAR, which is typically an
 order of magnitude more expensive.
 
+<!-- FIXME: rewrite in past tense -->
+
 ## Ethical Implications
 
 ### Civilian Use
@@ -424,7 +426,7 @@ a case for the expected adaptability of a DDPG algorithm in curriculum learning.
 Because both PPO and DDPG are model-free algorithms with continuous state and
 action spaces, we expect similar levels of aptness for curriculum learning.
 
-# Method of approach
+# Method of Approach
 
 This project uses the Copter Express (COEX) Clover quadcopter platform, equipped
 with Time of Flight (ToF) ranging sensors, and applies a Deep Deterministic
@@ -944,6 +946,10 @@ the number of episodes increases.
 ## Theory
+<!--
+FIXME: restructure by breaking into "algorithmic" and "sensor" subsections
+-->
 ### Deep Reinforcement Learning
 As stated, this project uses a Deep RL algorithm known as the Deep Deterministic
@@ -1647,6 +1653,10 @@ order of days or weeks.
 ![Episodic duration versus episode number for run 1, whose training results are displayed in {+@fig:plot1}.](images/plots/plot-episode-duration.png){#fig:plot-episode-duration width=100%}
+### Threats to Validity
+<!-- FIXME: add threats to validity section -->
 # Future Work
 The results of this project suggest the need for more extensive training using
```
