Load balancing JBoss Data Virtualization with HAProxy

foogaro/jdv-play

Incipit

Load balancing JBoss Data Virtualization using HAProxy. The environment is entirely based on Docker, so everything can be scripted and automated.

Terminology

First things first.

It's important that we all speak the same language, or at least mean the same thing.

Term Description
RHEL It stands for Red Hat Enterprise Linux, it is a Linux distribution developed and maintained by Red Hat.
DBMS It stands for Database Management System, it is a computer program designed to manage a database, a large set of structured data, and run operations on the data requested by numerous users.
RDBMS It stands for Relational Database Management System, it is a database management system (DBMS) that is based on the relational model.
API It stands for Application Programming Interface, it is a set of subroutine definitions, protocols, and tools for building application software.
JDBC It stands for Java Database Connectivity, it is an application programming interface (API) for the Java programming language, which defines how a client can access a database.
ODBC It stands for Open Database Connectivity, it is a standard application programming interface (API) for accessing database management systems (DBMS).
TCP It stands for Transmission Control Protocol, it is one of the main protocols of the Internet protocol suite.
HTTP It stands for Hypertext Transfer Protocol, it is an application protocol for distributed, collaborative, and hypermedia information systems.
REST It stands for Representational State Transfer (also known as RESTful Web services), it is one way of providing interoperability between computer systems on the Internet.
SOAP It stands for Simple Object Access Protocol, it is a protocol specification for exchanging structured information in the implementation of web services in computer networks.
SQL It stands for Structured Query Language, it is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS).
DDL It stands for Data Definition Language, it is used to define the database structure or schema.
DML It stands for Data Manipulation Language, it is used for managing data within schema objects.
NOSQL It stands for Not Only SQL, it is a kind of a database, which provides a mechanism for storage and retrieval of data which is modeled in means other than the tabular relations used in relational databases.
JDV It stands for JBoss Data Virtualization.
DV It stands for Data Virtualization, it is used as a synonym for JDV.
VDB It stands for Virtual Database, it is the deployment unit of a JDV project.
IDE It stands for Integrated Development Environment, it is a software application that provides comprehensive facilities to computer programmers for software development.
JBDS It stands for JBoss Developer Studio, it is a development environment created and currently developed by Red Hat.
Image It's meant to be the software stack, that will compose a runtime environment. In our scenario, it's meant to be a Docker image.
Container It's meant to be a Linux container, based on a specific image. In our scenario, a Docker container.
Host It's meant to be the server running this repo, thus hosting Docker's image repository.
Guest It's deprecated. Where it is used, read it as container, more precisely a named container.
HA It stands for High Availability, it is a characteristic of a system, which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period.
IHIH It stands for I Hope It Helps, ça va sans dire.

Overview

JBoss Data Virtualization is a data abstraction middleware layer. Broadly speaking, it belongs to the data integration side of Big Data. With JDV you can model, compose, and integrate your data domain objects by importing data sources of any kind (RDBMS, NoSQL, spreadsheets, Hadoop, SAP, text files, etc.) and expose them in whatever form is most convenient for the client application to read, in terms of both structure and access protocol.

JDV supports:

  • JDBC;
  • ODBC;
  • REST;
  • SOAP;
  • OData v4 (also partially OData v2).

In this repository you will find the MySQL Docker files along with the DDL and DML files (mysql folder), a JDV project (jdv-play folder) based on JBoss Developer Studio with the Teiid plugins, the JDV Docker files (jdv folder), and the HAProxy Docker files (haproxy folder).

Prerequisites

You need a "couple" of things:

  • internet connection;
  • docker installed;
  • RHEL subscription to use;
  • JDV installer;
  • enough disk space (you will not believe how much Docker enlarges itself);
  • git clone this repo.

Docker networking (just a bit)

Linux containers are built on isolation. When running Docker containers (mind the "s") you probably want some of them to interact with each other and others not to. Docker helps you achieve this by creating subnets. Throughout jdv-play we will use a network called jdv.

Just follow these easy steps:

docker network create jdv
8af8d25ea76210b473ab535750a56560f41ed5f249f858b3cb8b89fb75546837
docker network inspect jdv
[
    {
        "Name": "jdv",
        "Id": "8af8d25ea76210b473ab535750a56560f41ed5f249f858b3cb8b89fb75546837",
        "Created": "2017-05-19T12:30:35.841439238Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {},
        "Options": {},
        "Labels": {}
    }
]
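
The containers started later in this guide use hard-coded addresses in 172.18.0.0/16, while Docker picks the subnet of a user-defined network on its own unless told otherwise. As a minimal sketch, assuming you want exactly the subnet shown in the output above, the --subnet and --gateway options of docker network create pin it. The DOCKER variable and the function name are illustrative and not part of this repo.

```shell
# Command used to talk to Docker; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Create the "jdv" bridge network with an explicit subnet and gateway,
# matching the addresses shown in the inspect output above.
create_jdv_network() {
    "$DOCKER" network create \
        --driver bridge \
        --subnet 172.18.0.0/16 \
        --gateway 172.18.0.1 \
        jdv
}
```

Call create_jdv_network in place of the plain docker network create jdv. On a network created with an explicit subnet, docker run also accepts --ip to give a container a fixed address.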

Build the RHEL base image

All Docker images in this project are based on RHEL, so we first need to build the base image. In the "rhel" folder, edit the file named rhel.dockerfile and use your own credentials to register with the Red Hat Network. Here is an example:

FROM registry.access.redhat.com/rhel:latest

MAINTAINER Foogaro <l.fugaro@gmail.com>

RUN subscription-manager register --username YOURUSERNAME --password DONOTUSEPASSWORDDOTONE
RUN subscription-manager attach --auto
RUN yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel 
RUN yum install -y which net-tools mlocate
RUN yum -y update && yum clean all

Once done, create your RHEL image by building your dockerfile, as follows:

docker build -f rhel.dockerfile -t foogaro/rhel .

MySQL

First we will create a data container to store our database. Doing so, any update to MySQL will be kept locally on the host machine (that is your laptop running all this). Then, we will run a mysql container linking the data container. Once done, we will run the script to create and populate our database.

Build the data container

docker build -f mysql_data.dockerfile -t foogaro/mysql_data .

The above command is wrapped into the mysql_data.build file.

Create the data container

docker create --name mysql_data foogaro/mysql_data

The above command is wrapped into the mysql_data.create file.

Run MySQL linking the data container

docker run -it --rm=true --name="mysql" --net="jdv" --volumes-from mysql_data -v /jdv-play/mysql/var/lib/mysql:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=root -p 3306:3306 mysql

The above command is wrapped into the mysql_data.create file.
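
The mysql container may need some time to initialise before it accepts connections, and the run command above keeps the terminal in the foreground. A small sketch for a second terminal, assuming the container is named mysql and the root password is root as above: it polls the server with mysqladmin ping until it answers. The DOCKER variable and the function name are made up for this example.

```shell
# Command used to talk to Docker; can be overridden for testing.
DOCKER="${DOCKER:-docker}"

# Poll the MySQL server inside the "mysql" container once per second.
# $1 is the maximum number of attempts (default 30).
# Returns 0 as soon as the server answers, 1 if it never does.
wait_for_mysql() {
    tries="${1:-30}"
    while [ "$tries" -gt 0 ]; do
        if "$DOCKER" exec mysql mysqladmin -u root -proot --silent ping \
            >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

For example, wait_for_mysql 60 && echo "MySQL is ready" waits up to about a minute before giving up.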

Create and populate database

It's gonna be tricky (and I'm pretty sure it's because of me)

cp employees.ddl var/lib/mysql/
unzip mysql-load-departments.zip -d var/lib/mysql/
unzip mysql-load-dept-emp.zip -d var/lib/mysql/
unzip mysql-load-dept-manager.zip -d var/lib/mysql/
unzip mysql-load-employees.zip -d var/lib/mysql/
unzip mysql-load-salaries.zip -d var/lib/mysql/
unzip mysql-load-titles.zip -d var/lib/mysql/

The above commands are wrapped into the mysql_data.load file.

These commands just place the DDL and DML files into the shared volume, so that the loading phase can run within the mysql container. Open a shell in it as follows:

docker exec -ti mysql /bin/bash

Now inside the mysql container execute the following:

mysql -u root -proot < /var/lib/mysql/employees.ddl
mysql -u root -proot employees < /var/lib/mysql/mysql-load-departments.dml
mysql -u root -proot employees < /var/lib/mysql/mysql-load-dept-emp.dml
mysql -u root -proot employees < /var/lib/mysql/mysql-load-dept-manager.dml
mysql -u root -proot employees < /var/lib/mysql/mysql-load-employees.dml
mysql -u root -proot employees < /var/lib/mysql/mysql-load-salaries1.dml
mysql -u root -proot employees < /var/lib/mysql/mysql-load-salaries2.dml
mysql -u root -proot employees < /var/lib/mysql/mysql-load-salaries3.dml
mysql -u root -proot employees < /var/lib/mysql/mysql-load-titles.dml

Unfortunately the following docker command didn't work:

docker exec -ti mysql mysql -u root -proot < /var/lib/mysql/employees.ddl

or

docker exec -ti mysql "mysql -u root -proot < /var/lib/mysql/employees.ddl"

Not even the mysql command wrapped into an executable file!

I spent an afternoon, and it didn't work!

Please send me a PR with the fix!
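
The two docker exec attempts above most likely fail for shell reasons rather than Docker ones. The < redirect is processed by the host shell, which looks for /var/lib/mysql/employees.ddl on the host. The -t option requests a terminal, which is not available when stdin is a file. In the quoted form, the whole string is passed to docker exec as the name of a single executable. A possible fix, sketched under the assumption that bash is available in the mysql container (the interactive step above already relies on it); the DOCKER variable and the function names are hypothetical.

```shell
# Command used to talk to Docker; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Option 1: let a shell inside the container handle the redirect, so the
# path is resolved in the container's filesystem.
# $1 = file name under /var/lib/mysql, $2 = optional database name.
load_in_container() {
    "$DOCKER" exec mysql bash -c "mysql -u root -proot $2 < /var/lib/mysql/$1"
}

# Option 2: stream a file from the host over stdin. Use -i only, because
# -t asks for a terminal and stdin is a file here.
# $1 = path of the file on the host, $2 = optional database name.
load_from_host() {
    "$DOCKER" exec -i mysql mysql -u root -proot ${2:+"$2"} < "$1"
}
```

For example, load_in_container employees.ddl creates the schema, and load_in_container mysql-load-departments.dml employees loads the first table.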

Client SQL

Now, if the steps above succeeded, you can have a look at the database with your favorite SQL client, and you should see the following schema:

[Image: schema of the employees database]

And the count for each table should be as follows:

Schema Table Count
employees departments 9
employees dept_emp 331,603
employees dept_manager 24
employees employees 300,024
employees salaries 2,844,047
employees titles 443,308
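
If you prefer the command line to a graphical SQL client, the counts can be checked from the host. This is a rough sketch, assuming the mysql container is running with the root password root as above; the DOCKER variable and the function names are illustrative.

```shell
# Command used to talk to Docker; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Print the number of rows of one table of the employees schema.
# -N drops the column header and -B selects plain batch output.
count_rows() {
    "$DOCKER" exec mysql mysql -u root -proot -N -B \
        -e "SELECT COUNT(*) FROM employees.$1"
}

# Print "table count" for every table listed above.
print_counts() {
    for table in departments dept_emp dept_manager employees salaries titles; do
        printf '%s %s\n' "$table" "$(count_rows "$table")"
    done
}
```

Running print_counts should report the same numbers as the table above.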

JDV

Make sure MySQL is up and running, and leave it running. Also, make sure you have placed the following files in the "jdv/software" folder:

  • jboss-dv-6.3.0-installer.jar
  • jboss-dv-6.3.5-patch.jar
  • jboss-dv-6.3.0-teiid-jdbc.jar
  • jboss-dv-psqlodbc-6.2.0-3.el7.x86_64.rpm
  • mysql-connector-java-5.1.40-bin.jar

Build the image

docker build -f jdv.dockerfile -t foogaro/jdv:6.3.5 .

The above command is wrapped into the jdv.build file.

Run the container "jdv1"

docker run -it --rm="true" --name="jdv1" --link="mysql" --net="jdv" -p 18009:8009 -p 18080:8080 -p 19990:9990 -p 19999:9999 -p 14447:4447 -p 31100:31000 -p 35431:35432 -p 55210:55200/udp -e DOCKER_MYSQL_IP=172.18.0.2 -e DOCKER_MYSQL_PORT=3306 -e DOCKER_MYSQL_DBNAME=employees foogaro/jdv:6.3.5 -b 172.18.0.3 -bmanagement 172.18.0.3 --server-config=standalone-ha.xml -Djboss.node.name=JDV1

The above command is wrapped into the jdv1.run file.

Configure the container "jdv1"

docker exec -t jdv1 /opt/rh/configure.sh

Run the container "jdv2"

docker run -it --rm="true" --name="jdv2" --link="mysql" --net="jdv" -p 28009:8009 -p 28080:8080 -p 29990:9990 -p 29999:9999 -p 24447:4447 -p 31200:31000 -p 35432:35432 -p 55220:55200/udp -e DOCKER_MYSQL_IP=172.18.0.2 -e DOCKER_MYSQL_PORT=3306 -e DOCKER_MYSQL_DBNAME=employees foogaro/jdv:6.3.5 -b 172.18.0.4 -bmanagement 172.18.0.4 --server-config=standalone-ha.xml -Djboss.node.name=JDV2

The above command is wrapped into the jdv2.run file.

Configure the container "jdv2"

docker exec -t jdv2 /opt/rh/configure.sh
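
The run commands above hard-code 172.18.0.2 for MySQL and bind the two JDV servers to 172.18.0.3 and 172.18.0.4, so it is worth checking that Docker really assigned those addresses. A small sketch using docker inspect with a Go template; the DOCKER variable and the function names are assumptions made for this example.

```shell
# Command used to talk to Docker; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Print the IPv4 address a container has on the network(s) it is attached to.
container_ip() {
    "$DOCKER" inspect \
        -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Print name and address for the database and the two JDV servers.
show_addresses() {
    for name in mysql jdv1 jdv2; do
        printf '%s %s\n' "$name" "$(container_ip "$name")"
    done
}
```

If show_addresses reports different addresses, adjust the -b, -bmanagement and DOCKER_MYSQL_IP values in the run commands, as well as the server lines in haproxy.cfg.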

JBoss Developer Studio project

I'm not going to describe how to import database metadata, create views, and so on. In the repository you can find the JBDS project in the workspace folder named ws-jdv-play. Have a look at it, and feel free to change and update whatever you want.

HAProxy

HAProxy is free, open source software that provides a high availability load balancer and proxy server for TCP and HTTP-based applications that spreads requests across multiple servers.

The haproxy.cfg configuration file

global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode                    tcp
    log                     global
    option                  tcplog
    option                  dontlognull
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000


frontend http-proxy
    mode http
    option httplog
    bind *:80
    #bind *:443 ssl crt /etc/pki/haproxy/haproxy.pem no-sslv3
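    # Note: ssl_fc_sni is only set on connections where HAProxy itself terminates
    # SSL. While only the plain bind on port 80 is active, the three rules below
    # never match and every request falls through to default_backend.
    use_backend http-stats      if { ssl_fc_sni stats.employees.com }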
    use_backend http-odata      if { ssl_fc_sni odata.employees.com }
    use_backend http-dashboard  if { ssl_fc_sni dashb.employees.com }
    default_backend http-stats

frontend jdbc-proxy
    mode tcp
    option tcplog
    option tcpka
    option socket-stats
    timeout client 120m
    bind *:31000
    #bind *:31000 ssl crt /etc/pki/haproxy/haproxy.pem
    use_backend jdbc-employees

frontend odbc-proxy
    mode tcp
    option tcplog
    option tcpka
    option socket-stats
    timeout client 120m
    bind *:35432
    use_backend odbc-employees


backend http-stats
    mode http
    stats enable
    stats realm Haproxy\ Statistics
    stats uri /
    stats auth admin:haproxy.2017
    server haproxy 172.18.0.5:80 check

backend http-odata
    mode http
    option httpclose
    balance roundrobin
    server jdv1 172.18.0.3:8080 check
    server jdv2 172.18.0.4:8080 check

backend http-dashboard
    mode http
    option forwardfor
    balance roundrobin
    option httpclose
    acl has_path_dashboard path_beg /dashboard
    http-request redirect code 302 location http://%[hdr(host)]/dashboard/ unless has_path_dashboard
    server jdv1 172.18.0.3:8080 check
    server jdv2 172.18.0.4:8080 check

backend jdbc-employees
    #balance leastconn
    balance roundrobin
    timeout queue 120m
    timeout server 120m
    server jdv1 172.18.0.3:31000 check
    server jdv2 172.18.0.4:31000 check

backend odbc-employees
    #balance leastconn
    balance roundrobin
    timeout queue 120m
    timeout server 120m
    server jdv1 172.18.0.3:35432 check
    server jdv2 172.18.0.4:35432 check

Build the image

docker build -f haproxy.dockerfile -t foogaro/haproxy:1.5.18 .

The above command is wrapped into the haproxy.build file.

Run the container "haproxy"

docker run -ti --rm="true" --name="haproxy" --net="jdv" -p 80:80 -p 31000:31000 -p 35433:35432 foogaro/haproxy:1.5.18

The above command is wrapped into the haproxy.run file.

Status and statistics

You can now access HAProxy's statistics page at the following URL (credentials are specified in the haproxy.cfg file):

http://127.0.0.1

And you should see the following page:

[Image: HAProxy statistics page]
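
For scripted checks, HAProxy can also export its statistics as CSV when ;csv is appended to the stats URI. The following is only a sketch: it assumes curl and awk are installed on the host and that the page answers at http://127.0.0.1/ with the credentials from haproxy.cfg. The function names are made up for this example.

```shell
# Read HAProxy CSV statistics on stdin and print "proxy/server status".
# In HAProxy's CSV layout, field 1 is the proxy name, field 2 the server
# name and field 18 the status. The header line starts with "#".
summarize_stats() {
    awk -F, '!/^#/ && NF >= 18 { print $1 "/" $2, $18 }'
}

# Fetch the CSV export from the statistics page published on port 80.
fetch_stats() {
    curl -s -u admin:haproxy.2017 "http://127.0.0.1/;csv"
}
```

Running fetch_stats | summarize_stats prints one line per frontend, backend, and server, which makes it easy to see whether jdv1 and jdv2 are reported as up.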

Conclusion

Everything is up and running.

Check it via Docker as follows:

docker network inspect jdv
[
    {
        "Name": "jdv",
        "Id": "8af8d25ea76210b473ab535750a56560f41ed5f249f858b3cb8b89fb75546837",
        "Created": "2017-05-19T12:30:35.841439238Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "28cf4be3ddbdd147732db95c2d0a1b6d53bbe0decc89fd07533292f95a1be456": {
                "Name": "jdv1",
                "EndpointID": "029f0aa2bd384b2564fabf53426395dd10a1ff5f09743ca4bf6a8e1d9d8fde61",
                "MacAddress": "02:42:ac:12:00:03",
                "IPv4Address": "172.18.0.3/16",
                "IPv6Address": ""
            },
            "525216a59756266270b9f049fa3ebb7912251012af9fe04a7146f06bd64cc669": {
                "Name": "mysql",
                "EndpointID": "44a7df9ef223915314b01bb848bd3d84bb17a3b281f0f5dac5e55c1aa77fd2cf",
                "MacAddress": "02:42:ac:12:00:02",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": ""
            },
            "84eb2e037f633c502e2260124413fa7516d8d28ce1403a367f0fc38135ef970d": {
                "Name": "haproxy",
                "EndpointID": "e1c5a9818bddb9c6956cac05736f8e1eb63a5fc29af039eee929417382a09e7b",
                "MacAddress": "02:42:ac:12:00:05",
                "IPv4Address": "172.18.0.5/16",
                "IPv6Address": ""
            },
            "c8626c54f8f9f872ea7bd963d9babb6b83f68abf59201b6236b5e6649a06423a": {
                "Name": "jdv2",
                "EndpointID": "7734edbee4c58b89795ad593ab7bec013ad63cc1bba42497b2a498e409657b4d",
                "MacAddress": "02:42:ac:12:00:04",
                "IPv4Address": "172.18.0.4/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]

You can now access your Virtual Database through a single entry point, HAProxy, using the protocol that best fits your needs.

Connect to the "EMP" Virtual Database using JDBC, ODBC, or the protocol you prefer, by pointing to HAProxy's IP and port (JDBC=31000, ODBC=35433, OData=80).
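
For JDBC, the Teiid driver (the jboss-dv-6.3.0-teiid-jdbc.jar listed earlier) takes URLs of the form jdbc:teiid:<vdb>@mm://<host>:<port>. As a minimal sketch, assuming HAProxy is reached on the local host, the hypothetical helper below builds such a URL; the user name and password depend on how your JDV instance is configured and are not shown here.

```shell
# Build a Teiid JDBC URL.
# $1 = VDB name, $2 = host (default 127.0.0.1), $3 = port (default 31000).
teiid_jdbc_url() {
    vdb="$1"
    host="${2:-127.0.0.1}"
    port="${3:-31000}"
    printf 'jdbc:teiid:%s@mm://%s:%s\n' "$vdb" "$host" "$port"
}
```

For example, teiid_jdbc_url EMP prints jdbc:teiid:EMP@mm://127.0.0.1:31000, which points at the jdbc-proxy frontend rather than at a single JDV node.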

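Now play around... and yes, you are allowed to stop one server, start it again, stop the other one, and so on. Note that the JDV containers are started with --rm, so a stopped container is removed: to bring it back, repeat the run and configure steps for that container.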

IHIH,

Ciao,

Luigi