Feed this web application a file, url, or paste with prose in English and it will provide data from which an assessment can be made, for the purposes of:
-
planning substantive revision
-
setting writing goals or objectives, as case managers might do
-
track a your progress toward a writing objective or accomplishing a challenge
-
or, track the progress of a student toward written langauge goals, or written expression or mechanics objects.
-
Generate speech from text for proofing or an accessibility toool (downloadable as mp3)
Python3, Flask, Apache2, mod_wsgi, stanford_ner, and NLTK are involved. The repo includes a Vagrantfile (contents at the end of this document).
However, the other tools can be used to install on Ubuntu 18.04, or adapted for other releases and distros.
Linux: vagrant box "bento/ubuntu-18.04"
Python: Python3.7.5
According to flask --version
:
Python 3.7.5 Flask 1.1.1 Werkzeug 0.16.0
Apache2
mod_wsgi 7.7.1
File structure, rough deployment script, building mod_wsgi from source, and a sample apache2 virtual server conf for the application are provided below.
demo server at https://opm.gonzotraining.com. interface is a bit different than what's provided here.
Install for your school or test env or home with this repo.
The deployment script assumes /vagrant exists and the user is vagrant.
However, there are vars at the top of the script. Replace /vagrant
with $installer_dir
and set the var to the path to your target directory without a trailing slash (such as /var/www/html/open-prose-metrics
). A simple change is necessary for the alias definition in the virtual host config (below) as well.
/
--/home/$USER/opm/ # app home. No ___init__.py present.
|--wsgi.py # gunicorn wsgi file
|--input/read_document # from opm/app/input/ directory, not opm/opm/input/
--/tmp/
|--gunicorn.sock # created dynamically as needed. Will have permissions from service definition(?)
--/usr/share/
|--stanfor-ner/ # create using sudo as $USER (owner of app);
|--nltk-data/ # create directory, then python -m nltk.downloader -d /usr/share/nltk_data stopwords words punkt brown vader_lexicon averaged_perceptron_tagger maxent_ne_chunker
--/etc/
|--systemd/system/gunicorn.service # specifically for opm
|--/etc/apache2/sites-available/opm.conf #
--/var/
|--log/opm/ # todo: change to opm
|--error.log # for the apache vhost
|--log/extraeyes/
|--app.log # flask output to stdout and stderror when running with flask app in venv
application user: vagrant
#!/bin/bash -ex
INSTALLER_DIR="/vagrant"
TARGET_DIR="/var/www/html/open-prose-metrics"
APP_OWNER="vagrant"
APP_GROUP="www-data"
LOG_DIR="/var/log/opm"
VENV="/var/www/html/open-prose-metrics/virtualenv"
NER_DIR="/usr/share"
NLTK_DATA_DIR="/usr/share/nltk_data"
# dpkg-reconfigure tzdata (automation?)
ln -fs /usr/share/zoneinfo/US/Eastern /etc/localtime && dpkg-reconfigure -f noninteractive tzdata
# Repository - General
echo "Install from Ubuntu Package Manager"
apt update && apt upgrade -y
apt install -y linux-headers-$(uname -r)
apt install -y apache2 python3 python3-dev python3-pip
apt install -y python3.7 python3.7-dev
apt install -y libenchant-dev unzip openjdk-8-jre-headless
# Repository - depends for pycurl
apt install -y libcurl4-openssl-dev libssl-dev
# Repository - for scipy and numpy
apt install -y libatlas-base-dev
# Repository - possibly required by numpy and scipy
apt install -y libpython3.7 libpython3.7-dev gcc gfortran python-dev libopenblas-dev liblapack-dev cython
# Repository - compiling mod-wsgi for python: dependencies
apt install -y apache2-dev
# Stanford NER (Required by nltk for opm NER)
wget https://nlp.stanford.edu/software/stanford-ner-4.0.0.zip
unzip stanford-ner-4.0.0.zip -d $NER_DIR
chown -R $APP_OWNER:$APP_GROUP $NER_DIR
mv $NER_DIR/stanford-ner-4.0.0 $NER_DIR/stanford-ner
# depends for pycurl
apt install -y libcurl4-openssl-dev libssl-dev
# Pip: system-wide dependencies
echo "Get Pip for Python3.7"
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3.7 get-pip.py
pip3.7 install wheel virtualenv ipython
echo "Create VENV"
virtualenv -p python3.7 $VENV
source $VENV/bin/activate
echo "Install from venv pip"
pip install -r /vagrant/opm/requirements.txt
# opm
echo "Copy App Folder"
cp -r /vagrant/opm $TARGET_DIR/opm # move app folder
#Create and set permissions
echo "Create Files and Directories and Set Permissions"
mkdir /var/log/extraeyes /var/log/opm /home/$APP_OWNER/.plotly
touch /var/log/extraeyes/app.log
chown -R $APP_OWNER:$APP_GROUP /var/log/extraeyes
chown -R $APP_OWNER:$APP_GROUP /var/log/opm
chown -R $APP_OWNER:$APP_GROUP /home/$APP_OWNER/.plotly
chmod 775 /home/$APP_OWNER/.plotly
chown -R $APP_OWNER:$APP_GROUP $TARGET_DIR
#nltk_data
echo "Fetching NLTK data"
python -m nltk.downloader -d /usr/share/nltk_data stopwords words punkt brown vader_lexicon averaged_perceptron_tagger maxent_ne_chunker #for ner
echo "Pre-create Postagger (depends on stanford NER)"
chown -R $APP_OWNER:$APP_GROUP $NER_DIR/stanford-ner
cd $TARGET_DIR/opm && python postagger.py
echo "Seeding Database and Testing Backend"
cd $TARGET_DIR/opm && python seed_database.py
echo "Configuring Apache2"
cp $INSTALLER_DIR/opm-configs/etc/apache2/sites-available/*.conf /etc/apache2/sites-available/
sed -i "s|{{ owner }}|$APP_OWNER|" /etc/apache2/sites-available/opm.conf
sed -i "s|{{ group }}|$APP_GROUP|" /etc/apache2/sites-available/opm.conf
a2dissite 000-default
a2ensite opm.conf
apt clean
#systemctl reload apache2
# END - mod_wsgi will be called
echo "Finished. Next, wsgi-install will be called to compile, install mod_wsgi and reload apache2"
cd /vagrant/
unzip /vagrant/4.7.1.zip
cd mod_wsgi-4.7.1/
./configure --with-python=/usr/bin/python3.7
make
make install
#cp /usr/lib/apache2/modules/mod_wsgi.so /etc/apache2/modules/
chmod 644 /usr/lib/apache2/modules/mod_wsgi.so
systemctl reload apache2
LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so
<VirtualHost *>
ServerName blah.com
ServerAlias opm.blah.com
ServerAlias essayeyes.com
#WSGIDaemonProcess opm user={{ owner }} group={{ group }} threads=5
WSGIDaemonProcess opm user=vagrant group=www-data threads=5 python-path=/var/www/html/open-prose-metrics/opm:/var/www/html/open-prose-metrics/virtualenv/lib/python3.7/site-packages
WSGIScriptAlias / /var/www/html/open-prose-metrics/opm/opm.wsgi
<Directory /var/www/html/open-prose-metrics>
WSGIProcessGroup opm
WSGIApplicationGroup %{GLOBAL}
#Order deny,allow
Allow from all
</Directory>
ErrorLog /var/log/opm/error.log
</VirtualHost>
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
config.vm.box = "bento/ubuntu-18.04"
config.vm.network "forwarded_port", guest: 80, host: 8081
config.vm.network "forwarded_port", guest: 5000, host: 5000
#config.vm.provision "file", source: "opm", destination: "~/opm"
#config.vm.provision "file", source: "flask-nginx-gunicorn/gunicorn/systemd/gunicorn.service", destination: "/etc/systemd/system/gunicorn.service"
#config.vm.provision "file", source: "flask-nginx-gunicorn/nginx/opm", destination: "/etc/nginx/sites-available/opm"
config.vm.provision "shell", path: "deploy.sh"
config.vm.provision "shell", path: "wsgi-install.sh"
config.vm.provider :virtualbox do |vb|
# # Don't boot with headless mode
# vb.gui = true
#
# # Use VBoxManage to customize the VM. For example to change memory:
vb.customize ["modifyvm", :id, "--memory", "1024", "--cpus", "2"] # once installed, 2048 is ok. One vcpu is file
end
end
© rik goldman, MIT LICENSE
High-school students at Chelsea School in Hyattsville, MD, gave encouragement and supported initial explorations into just how viable this project would be.