This version of Scalelite is designed to run in highly available Kubernetes clusters. It currently has the same features as upstream version v1.0.8.
- You only need to mount the same NFS share at /var/bigbluebutton/published and /var/bigbluebutton/playback on each of your BBB servers; no additional post-publish jobs are needed
- In combination with https://github.com/MBcom/enforce-authentication-greenlight your recordings are protected by Greenlight (LDAP) authentication
- Server load is calculated from the current number of conference participants instead of the number of running conferences, so the load can be balanced at a finer granularity
BigBlueButton is an open source web conferencing system for online learning.
Scalelite is an open source load balancer that manages a pool of BigBlueButton servers. It makes the pool of servers appear as a single (very scalable) BigBlueButton server. A front-end, such as Moodle or Greenlight, sends standard BigBlueButton API requests to the Scalelite server which, in turn, distributes those requests to the least loaded BigBlueButton server in the pool.
A single BigBlueButton server that meets the minimum configuration supports around 200 concurrent users.
For many schools and organizations, the ability to host 4 simultaneous classes of 50 users, or 8 simultaneous meetings of 25 users, is enough capacity. However, what if a school wants to support 1,500 users across 50 simultaneous classes? A single BigBlueButton server cannot handle such a load.
With Scalelite, a school can create a pool of 4 BigBlueButton servers and handle 16 simultaneous classes of 50 users. To scale higher, add more BigBlueButton servers to the pool.
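The sizing arithmetic above can be sketched as a small calculation. This is only a rough estimate that assumes the approximate figure of 200 concurrent users per server quoted earlier; the function names are hypothetical and not part of Scalelite.

```python
import math

# Approximate capacity of one minimum-spec BigBlueButton server (figure quoted above).
USERS_PER_SERVER = 200


def servers_needed(total_users: int, users_per_server: int = USERS_PER_SERVER) -> int:
    """Smallest pool size whose combined capacity covers total_users."""
    return math.ceil(total_users / users_per_server)


def classes_supported(servers: int, class_size: int,
                      users_per_server: int = USERS_PER_SERVER) -> int:
    """Number of classes that fit when no class is split across servers."""
    return servers * (users_per_server // class_size)


print(servers_needed(1500))      # 8
print(classes_supported(4, 50))  # 16
```

The result is a lower bound: real capacity depends on hardware and on how many webcams and screen shares are active.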
BigBlueButton has been in development for over 10 years now. The latest release is a pure HTML5 client, with extensive documentation. There is even a BigBlueButton install script called bbb-install.sh that lets you set up a BigBlueButton server (with a Let's Encrypt certificate) in about 15 minutes. Using bbb-install.sh you can quickly set up a pool of servers for management by Scalelite.
To load balance the pool, Scalelite periodically polls each BigBlueButton to check if it is reachable online, ready to receive API requests, and to determine its current load (number of currently running meetings). With this information, when Scalelite receives an incoming API call to create a new meeting, it places the new meeting on the least loaded server in the pool. In this way, Scalelite can balance the load of meeting requests evenly across the pool.
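As an illustration of the placement rule described above, the following minimal sketch picks the least loaded server among those that are online and enabled. The Server class and pick_server function are hypothetical names used for this example; Scalelite's own implementation keeps this state in Redis and is written in Ruby.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    """Hypothetical snapshot of what the poller knows about one server."""
    id: str
    load: float
    online: bool = True
    enabled: bool = True


def pick_server(servers: List[Server]) -> Optional[Server]:
    """Return the least loaded server that is online and enabled, or None."""
    candidates = [s for s in servers if s.online and s.enabled]
    return min(candidates, key=lambda s: s.load, default=None)


pool = [
    Server("bbb1", load=21.0),
    Server("bbb2", load=4.0),
    Server("bbb3", load=1.0, online=False),  # skipped: not reachable
]
print(pick_server(pool).id)  # bbb2
```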
Many BigBlueButton servers will create many recordings. Scalelite can serve a large set of recordings by consolidating them together, indexing them in a database, and, when receiving an incoming getRecordings request, using the database index to quickly return the list of available recordings.
The Scalelite installation process requires advanced technical knowledge. You should, at a minimum, be very familiar with
- Setup and administration of a BigBlueButton server
- Setup and administration of a Linux server and using common tools, such as systemd, to manage processes on the server
- How the BigBlueButton API works with a front-end
- How docker containers work
- How UDP and TCP/IP work together
- How to administer a Linux firewall
- How to set up a TURN server
If you are a beginner, you will have a difficult time getting any part of this deployment correct. If you require help, see Getting Help.
There are several components required to get Scalelite up and running:
- Multiple BigBlueButton Servers
- Scalelite LoadBalancer Server
- NFS Shared Volume
- PostgreSQL Database
- Redis Cache
- NGINX servers for providing the recordings to users
An example Scalelite deployment will look like this:
For the Scalelite Server, the minimum recommended server requirements are:
- 4 CPU Cores
- 8 GB Memory
For each BigBlueButton server, the minimum requirements can be found here.
For the external Postgres Database, the minimum recommended server requirements are:
- 2 CPU Cores
- 2 GB Memory
- 20 GB Disk Space (should be good for tens of thousands of recordings)
For the external Redis Cache, the minimum recommended server requirements are:
- 2 CPU Cores
- 0.5GB Memory
- Persistence must be enabled
To set up a pool of BigBlueButton servers (minimum recommended number is 3), we recommend using bbb-install.sh as it can automate the steps to install, configure (with SSL + Let's Encrypt), and update the server when new versions of BigBlueButton are released.
To help users who are behind restrictive firewalls to send/receive media (audio, video, and screen share) to your BigBlueButton server, you should set up a TURN server and configure each BigBlueButton server to use it.
Again, bbb-install.sh can automate this process for you.
If you have not yet created a namespace, or do not want to reuse the namespace of your Greenlight Kubernetes deployment, create one now:
kubectl create namespace <your scalelite namespace>
Now run the following:
sed -i 's/scalelite-ns/<your scalelite namespace>/g' ./kubernetes/*.yaml
Mount the same shared folder on all BBB servers at /var/bigbluebutton/published and /var/bigbluebutton/playback.
Make sure you copy the existing data to the NFS share before you mount over it.
Now create the persistent volume in your Kubernetes cluster:
sed -i 's/NAS-IP/<your NAS IP or DNS Name>/g' ./kubernetes/bbb-nas.yaml
sed -i 's/NFS-SHARE-PATH/<your NFS share path>/g' ./kubernetes/bbb-nas.yaml
kubectl apply -f ./kubernetes/bbb-nas.yaml
To create a PostgreSQL Database run the following. Make sure you have helm installed first.
# if you have not done before
# helm repo add bitnami https://charts.bitnami.com/bitnami
sed -i 's/POSTGRES-PASSWORD/<super secure postgres password>/g' ./kubernetes/scalelite-postgresql.values
helm install -n <your scalelite namespace> scalite-postgres -f ./kubernetes/scalelite-postgresql.values bitnami/postgresql
To create the Redis Cache run the following.
sed -i 's/REDIS-PASSWORD/<super secure redis password>/g' ./kubernetes/scalite-redis.values
helm install -n <your scalelite namespace> scalelite-redis -f ./kubernetes/scalite-redis.values bitnami/redis
To deploy the Scalelite containers run the following:
sed -i 's/SECRET-KEY/<super secret key>/g' ./kubernetes/*.yaml
sed -i 's/LOADBALANCER-SECRET/<the scalelite API key>/g' ./kubernetes/*.yaml
sed -i 's/REDIS-PASSWORD/<super secure redis password>/g' ./kubernetes/*.yaml
sed -i 's/POSTGRES-PASSWORD/<super secure postgres password>/g' ./kubernetes/*.yaml
sed -i 's/conf.example.com/<your desired URL - maybe your Greenlight URL>/g' ./kubernetes/*.yaml
kubectl apply -f ./kubernetes/ingress-video.yaml
kubectl apply -f ./kubernetes/nginx-videos-pdb.yaml
kubectl apply -f ./kubernetes/nginx-videos.yaml
kubectl apply -f ./kubernetes/scalelite-pdb.yaml
kubectl apply -f ./kubernetes/scalelite-service.yaml
kubectl apply -f ./kubernetes/scalelite.yaml
To switch your front-end application to use Scalelite instead of a single BigBlueButton server, two changes need to be made:
- BigBlueButton server url should be set to the url of your Scalelite deployment: http(s)://<scalelite-hostname>/bigbluebutton/api/
- BigBlueButton shared secret should be set to the LOADBALANCER_SECRET value that you set in /etc/default/scalelite
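To show how a front-end uses these two settings, here is a minimal sketch of building a signed API URL. It assumes the standard BigBlueButton checksum scheme (SHA-1 over the call name, the query string, and the shared secret); the hostname and secret below are placeholders, and signed_api_url is a hypothetical helper rather than part of Scalelite or any front-end.

```python
import hashlib
from urllib.parse import urlencode


def signed_api_url(base_url: str, call: str, params: dict, secret: str) -> str:
    """Build a BigBlueButton-style API URL with a SHA-1 checksum parameter."""
    query = urlencode(params)
    # Checksum input is: call name + query string + shared secret.
    checksum = hashlib.sha1((call + query + secret).encode("utf-8")).hexdigest()
    base = base_url.rstrip("/")
    separator = "&" if query else ""
    return f"{base}/{call}?{query}{separator}checksum={checksum}"


# Placeholder values; substitute your Scalelite URL and LOADBALANCER_SECRET.
BASE_URL = "https://scalelite.example.com/bigbluebutton/api/"
SECRET = "replace-with-your-loadbalancer-secret"

print(signed_api_url(BASE_URL, "getMeetings", {}, SECRET))
```

Because Scalelite presents the same API as a single BigBlueButton server, a front-end that already signs requests this way needs only the new URL and secret.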
- URL_HOST: The hostname that the application API endpoint is accessible from. Used to protect against DNS rebinding attacks. Should be left blank if deploying Scalelite behind a network load balancer.
- SECRET_KEY_BASE: A secret used internally by Rails. Should be unique per deployment. Generate with bundle exec rake secret or openssl rand -hex 64.
- LOADBALANCER_SECRET: The shared secret that applications will use when calling BigBlueButton APIs on the load balancer. Generate with openssl rand -hex 32
- LOADBALANCER_SECRETS: Additional shared secrets, separated by a colon (:). Any of these secrets will work. In an environment where multiple applications need to integrate with a single Scalelite server, it may be sensible to give each application its own secret. This way, revoking individual secrets later will not disturb other applications. For events like analytics-callback to work, the BigBlueButton servers' secrets should be added here.
- DATABASE_URL: URL for connecting to the PostgreSQL database, see the Rails documentation. The URL should be in the form of postgresql://username:password@connection_url. Note that instead of using this environment variable, you can configure the database server in config/database.yml.
- REDIS_URL: URL for connecting to the Redis server, see the Redis gem documentation. The URL should be in the form of redis://username:password@connection_url. Note that instead of using this environment variable, you can configure the redis server in config/redis_store.yml (see below).
- POLL_INTERVAL: Used by the "poller" image to set the interval at which BigBlueButton servers are polled, in seconds. Defaults to 60.
- RECORDING_IMPORT_POLL_INTERVAL: How often to check the recording spool directory for new recordings, in seconds (when running in poll mode). Defaults to 60.
- INTERVAL: Adjust the polling interval (in seconds) for updating server statistics and meeting status. Defaults to 60. Only used by the "poll" task.
- WEB_CONCURRENCY: The number of processes for the puma web server to fork. A reasonable value is 2 per CPU thread or 1 per 256MB ram, whichever is lower.
- RAILS_MAX_THREADS: The number of threads to run in the Rails process. The number of Redis connections in the pool defaults to match this value. The default is 5, a reasonable value for production.
- RAILS_ENV: Either development, test, or production. The Docker image defaults to production. Rails defaults to development.
- BUILD_NUMBER: An additional build version to report in the BigBlueButton top-level API endpoint. The Docker image has this preset to a value determined at image build time.
- RAILS_LOG_TO_STDOUT: Log to STDOUT instead of a file. Recommended for deployments with a service manager (e.g. systemd) or in Docker. The Docker image sets this by default.
- RAILS_LOG_LEVEL: Set log level of production environment (debug, info, warn, error, fatal, unknown). Default is debug.
- REDIS_POOL: Configure the Redis connection pool size. Defaults to RAILS_MAX_THREADS.
- MAX_MEETING_DURATION: The maximum length of any meeting created on any server. If the duration is passed as part of the create call, it will only be overwritten if it is greater than MAX_MEETING_DURATION.
- RECORDING_SPOOL_DIR: Directory where transferred recording files are placed. Defaults to /var/bigbluebutton/spool
- RECORDING_WORK_DIR: Directory where temporary files from recording transfer/import are extracted. Defaults to /var/bigbluebutton/recording/scalelite
- RECORDING_PUBLISH_DIR: Directory where published recording files are placed to make them available to the web server. Defaults to /var/bigbluebutton/published
- RECORDING_UNPUBLISH_DIR: Directory where unpublished recording files are placed to make them unavailable to the web server. Defaults to /var/bigbluebutton/unpublished
- SERVER_HEALTHY_THRESHOLD: The number of times an offline server needs to respond successfully for it to be considered online. Defaults to 1. If you increase this number, you should decrease POLL_INTERVAL
- SERVER_UNHEALTHY_THRESHOLD: The number of times an online server needs to respond unsuccessfully for it to be considered offline. Defaults to 2. If you increase this number, you should decrease POLL_INTERVAL
- DB_DISABLED: Disable the database by setting this value to true.
- RECORDING_DISABLED: Disable the recording feature and all its associated APIs by setting this value to true.
- RECORDING_IMPORT_UNPUBLISHED: Imported recordings can be marked as unpublished by default by setting this value to true. Defaults to false.
- GET_MEETINGS_API_DISABLED: Disable the GET_MEETINGS API by setting this value to true.
- POLLER_THREADS: The number of threads to run in the poller process. The default is 5. The number of poller threads should be increased carefully, since a higher number can lead to Denial Of Service problems at DNS.
- CONNECT_TIMEOUT: The timeout for establishing a network connection to the BigBlueButton server in the load balancer and poller in seconds. Default is 5 seconds. Floating point numbers can be used for timeouts less than 1 second.
- POLLER_WAIT_TIMEOUT: The timeout value set for the poller to finish polling a server. Defaults to 10.
- RESPONSE_TIMEOUT: The timeout to wait for a response after sending a request to the BigBlueButton server in the load balancer and poller in seconds. Default is 10 seconds. Floating point numbers can be used for timeouts less than 1 second.
- LOAD_MIN_USER_COUNT: Minimum user count of a meeting, used for calculating server load. Defaults to 15.
- LOAD_JOIN_BUFFER_TIME: The time (in minutes) until the LOAD_MIN_USER_COUNT will be used for calculating server load. Defaults to 15.
- SERVER_ID_IS_HOSTNAME: If set to "true", then instead of generating random UUIDs as the server ID when adding a server, Scalelite will use the hostname of the server as the id. Server hostnames will be checked for uniqueness. Defaults to "false".
- CREATE_EXCLUDE_PARAMS: List of BBB server attributes that should not be modified by create API call. Should be in the format 'CREATE_EXCLUDE_PARAMS=param1,param2,param3'.
- JOIN_EXCLUDE_PARAMS: List of BBB server attributes that should not be modified by join API call. Should be in the format 'JOIN_EXCLUDE_PARAMS=param1,param2,param3'.
- GET_RECORDINGS_API_FILTERED: Prevent get_recordings api from returning all recordings when recordID is not specified in the request, by setting value to 'true'. Defaults to false.
- PREPARED_STATEMENT: Enable/Disable Active Record prepared statements feature, can be disabled by setting the value to false. Defaults to true.
- DB_CONNECTION_RETRY_COUNT: The number of times db connection retries will be attempted, in case of a db connection failure. Defaults to 3.
- RECORDING_PLAYBACK_FORMATS: Recording playback formats supported by Scalelite, defaults to presentation:video:podcast:notes:capture.
- PROTECTED_RECORDINGS_ENABLED: Applies to the recording import process. If set to "true", then newly imported recordings will have protected links enabled. Default is "false".
- PROTECTED_RECORDINGS_TOKEN_TIMEOUT: Protected recording link token timeout in minutes. This is the amount of time that the one-time-use link returned in getRecordings calls will be valid for. Defaults to 60 minutes (1 hour).
- PROTECTED_RECORDINGS_TIMEOUT: Protected recordings resource access cookie timeout in minutes. This is the amount of time that a user will be granted access to view a recording for after clicking on the one-time-use link. Defaults to 360 minutes (6 hours).
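One plausible reading of how LOAD_MIN_USER_COUNT and LOAD_JOIN_BUFFER_TIME interact is sketched below: while a meeting is younger than the join buffer, it counts as at least the minimum user count, so a freshly created meeting that is still filling up is not underestimated. This is an interpretation of the descriptions above, not a copy of the poller code, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def meeting_load(user_count: int, age_minutes: float,
                 min_user_count: int = 15, join_buffer_minutes: float = 15) -> int:
    """Users attributed to one meeting when estimating server load."""
    if age_minutes < join_buffer_minutes:
        # Young meeting: participants may still be joining, so assume a floor.
        return max(user_count, min_user_count)
    return user_count


def server_load(meetings: Iterable[Tuple[int, float]],
                min_user_count: int = 15, join_buffer_minutes: float = 15) -> int:
    """Sum the per-meeting load over (user_count, age_minutes) pairs."""
    return sum(
        meeting_load(users, age, min_user_count, join_buffer_minutes)
        for users, age in meetings
    )


# Three meetings: a new small one, a new large one, and an old small one.
print(server_load([(3, 2), (40, 5), (3, 30)]))  # 58
```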
To upgrade, edit the Docker image versions used in ./kubernetes/scalelite.yaml and then apply the file with kubectl again.
Scalelite comes with a set of commands to
- Add/remove BigBlueButton servers from the pool
- Trigger an immediate poll of all BigBlueButton servers
- Change the state of any BigBlueButton server to being available or unavailable (don't try to put new meetings on the server)
- Monitor the load of all BigBlueButton servers
Server management is provided using rake tasks which update server information in Redis.
./bin/rake servers
This will print a summary of details for each server which looks like this:
id: 2d2d674a-c6bb-48f3-8ad4-68f33a80a5b7
url: https://bbb1.example.com/bigbluebutton/api
secret: 2bdce5cbab581f3f20b199b970e53ae3c9d9df6392f79589bd58be020ed14535
enabled
load: 21.0
load multiplier: 2.0
online
Particular information to note:
- id: This is the ID value used when updating or removing the server
- enabled or disabled: Whether the server is administratively enabled. See "Enable/Disable servers" below.
- load: The number of meetings on the server. New meetings will be scheduled on servers with lower load. Updated by the poll process.
- online: Whether the server is responding to API requests. Updated by the poll process.
./bin/rake servers:add[url,secret,loadMultiplier]
The url value is the complete URL to the BigBlueButton API endpoint of the server. The /api on the end is required.
You can find the BigBlueButton server's URL and Secret by running bbb-conf --secret on the BigBlueButton server.
The loadMultiplier can be used to give individual servers a higher or lower priority over other servers. A higher loadMultiplier should be placed on the weaker servers. If not passed, it defaults to a value of 1.
This command will print out the ID of the newly created server, and OK if it was successful.
Note that servers are added in the disabled state; see "Enable a server" below to enable it.
Make sure that there are no spaces around the commas between the parameters [url,secret,loadMultiplier], because a space causes a "rake aborted!" error.
./bin/rake servers:remove[id]
Warning: Do not remove a server which has running meetings! This will leave the database in an inconsistent state. You should either wait for all meetings to end, or run the "Panic" function first.
./bin/rake servers:update[id,secret,loadMultiplier]
Updates the secret and load_multiplier for a BigBlueButton server.
The loadMultiplier can be used to give individual servers a higher or lower priority over other servers. A higher loadMultiplier should be placed on the weaker servers.
After changing it, the server needs to be polled at least once before the new load is visible.
./bin/rake servers:disable[id]
Mark the server as disabled.
When a server is disabled, no new meetings will be started on the server.
You will not be able to join existing meetings.
The Poll process does not update disabled servers.
You should not disable a server that has active load. Either use the cordon option to drain the server first, or respond with yes to clear all meeting state.
./bin/rake servers:enable[id]
Mark the server as enabled.
Note that the server won't be used for new meetings until after the next time the Poll process runs to update the load information.
./bin/rake servers:panic[id]
Disable a server and clear all meeting state. This method is used to recover from a crashed BigBlueButton server. After the meeting state is cleared, anyone who tries to join a meeting that was previously on this server will instead be directed to a new meeting on a different server.
./bin/rake servers:cordon[id]
Mark the server as cordoned.
When a server is cordoned, no new meetings will be started on the server.
Any existing meetings will continue to run until they finish.
The Poll process continues to run on cordoned servers to update the "Online" status and detect ended meetings.
The get_meetings API would also return all the active meetings in the cordoned server.
This is useful to "drain" a server for updates without disrupting any ongoing meetings.
The server state will be updated to disabled by the poller once the load on the server becomes zero or nil.
./bin/rake servers:loadMultiplier[id,newLoadMultiplier]
Sets the load_multiplier for a BigBlueButton server.
The loadMultiplier can be used to give individual servers a higher or lower priority over other servers. A higher loadMultiplier should be placed on the weaker servers.
After changing it, the server needs to be polled at least once before the new load is visible.
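To make the effect of the multiplier concrete, the sketch below assumes that the load used for placement is the raw load scaled by the multiplier, so a weaker server with a higher multiplier looks busier and receives fewer new meetings. The effective_load function and the numbers are illustrative assumptions, not Scalelite code.

```python
def effective_load(raw_load: float, load_multiplier: float = 1.0) -> float:
    """Scale a server's raw load by its multiplier (which must be positive)."""
    if load_multiplier <= 0:
        raise ValueError("load_multiplier must be greater than 0")
    return raw_load * load_multiplier


# A strong server with more raw load versus a weak server with a multiplier of 2.
strong = effective_load(30, 1.0)  # 30.0
weak = effective_load(20, 2.0)    # 40.0

# The strong server has the lower effective load, so it would get the next meeting.
print("strong" if strong < weak else "weak")  # strong
```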
./bin/rake poll:all
When you add a server to the pool, it may take upwards of 60 seconds (the default value of INTERVAL for the background server polling process) before Scalelite marks the server as online.
You can run the above task to have it poll the server right away without waiting.
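How quickly a poll result changes the online status also depends on SERVER_HEALTHY_THRESHOLD and SERVER_UNHEALTHY_THRESHOLD described in the configuration section. The sketch below models them as counters of consecutive poll results that disagree with the current state; treating the counts as consecutive is an assumption, and HealthTracker is a hypothetical class for illustration only.

```python
class HealthTracker:
    """Track online/offline status from a stream of poll results."""

    def __init__(self, healthy_threshold: int = 1, unhealthy_threshold: int = 2,
                 online: bool = False) -> None:
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.online = online
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, success: bool) -> bool:
        """Record one poll result and return the resulting online status."""
        if success == self.online:
            self._streak = 0  # result agrees with the current state
        else:
            self._streak += 1
            threshold = self.healthy_threshold if success else self.unhealthy_threshold
            if self._streak >= threshold:
                self.online = success
                self._streak = 0
        return self.online


tracker = HealthTracker()      # defaults: 1 success to go online, 2 failures to go offline
print(tracker.record(True))    # True
print(tracker.record(False))   # True  (one failure is not enough)
print(tracker.record(False))   # False (second consecutive failure)
```

With the defaults, a single failed poll does not take a server out of rotation, which avoids flapping on transient network errors.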
To list the meetings on specific servers, use:
./bin/rake servers:meeting_list["serverID1:serverID2:serverID3"]
To list all meetings running across all BigBlueButton servers, use:
./bin/rake servers:meeting_list
./bin/rake servers:addAll[file]
Deprecated: See servers:sync for a more flexible alternative.
Adds all the servers defined in a YAML file passed as an argument. The file passed in should have the following format:
servers:
  - url: "bbb1.example.com"
    secret: "1bdce5cbab581f3f20b199b970e53ae3c9d9df6392f79589bd58be020ed14535"
  - url: "bbb2.example.com"
    secret: "2bdce5cbab581f3f20b199b970e53ae3c9d9df6392f79589bd58be020ed14535"
  - url: "bbb3.example.com"
    secret: "3bdce5cbab581f3f20b199b970e53ae3c9d9df6392f79589bd58be020ed14535"
The command will print out each added server's url and id once it has been successfully added.
Note that all servers are added in the disabled state; see "Enable a server" above to enable them.
./bin/rake servers:sync[path,mode,dryrun]
Add, remove or modify servers according to a YAML configuration file.
The path parameter should point to a valid YAML configuration file as described below. Pass - as the path to read configuration from standard input instead. You can use the servers:yaml task to bootstrap a valid configuration file from an existing Scalelite cluster.
The mode parameter controls how unwanted servers are removed. mode=keep will not remove any servers. mode=cordon (the default) will remove empty servers and cordon non-empty servers. You may have to repeat the task once these servers are empty to actually remove them. mode=force will try to end all meetings on unwanted servers and then remove them. This works similarly to servers:panic[id].
If dryrun is true, the task will run normally but not persist any changes or end any meetings. This can be used to simulate a sync and see what would happen.
The configuration file should contain a complete list of all servers and follow this structure:
servers:
  <server-id>: # must be unique, should be a hostname
    secret: <string> # required
    url: <string> # default: "https://<server-id>/bigbluebutton/api"
    enabled: <bool> # default: true
    load_multiplier: <float> # default: 1.0, must be greater than 0
  # Example for a simple server with default values
  bbb1.example.com:
    secret: "1bdce5cbab581f3f20b199b970e53ae3c9d9df6392f79589bd58be020ed14535"
  # Full example for a legacy server (generated id)
  02bff3a7-c95f-49d3-b1e5-c53eddd4dd68:
    secret: "2bdce5cbab581f3f20b199b970e53ae3c9d9df6392f79589bd58be020ed14535"
    url: "https://bbb2.example.com/bigbluebutton/api"
    enabled: false
    load_multiplier: 5.0
The task will try to reach the desired cluster state by adding, removing or modifying servers as needed. To be more exact, the task will:
- Read the configuration file and perform some basic sanity checks.
- Add missing servers, based on server IDs.
- Update configuration for existing servers (secret, url and load_multiplier).
- Cordon servers that are enabled but should be disabled.
- Enable servers that are disabled or cordoned but should be enabled.
- Try to remove servers that are not present in the YAML configuration.
  - In keep mode, no servers are removed.
  - In cordon mode (default), only empty servers are removed. Non-empty servers are cordoned.
  - In force mode, servers are forcefully evicted and then removed.
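The add, update and remove decisions in the steps above can be sketched as a planning function that compares the desired configuration with the current cluster state. This is a simplified model under stated assumptions: it ignores the enable and cordon steps for servers that stay in the configuration, and plan_sync with its dictionary layout is hypothetical rather than the task's real implementation.

```python
from typing import Dict, List


def plan_sync(desired: Dict[str, dict], current: Dict[str, dict],
              mode: str = "cordon") -> Dict[str, List[str]]:
    """Return the server IDs to add, update, remove, cordon or forcefully remove.

    desired maps server ID to its configuration; current maps server ID to
    its state, where state["load"] is zero or None for an empty server.
    """
    if mode not in ("keep", "cordon", "force"):
        raise ValueError(f"unknown mode: {mode}")

    plan: Dict[str, List[str]] = {
        "add": [], "update": [], "remove": [], "cordon": [], "force_remove": [],
    }
    for server_id in desired:
        plan["update" if server_id in current else "add"].append(server_id)

    for server_id, state in current.items():
        if server_id in desired or mode == "keep":
            continue  # wanted server, or keep mode never removes anything
        if mode == "force":
            plan["force_remove"].append(server_id)
        elif not state.get("load"):
            plan["remove"].append(server_id)   # empty server: safe to remove
        else:
            plan["cordon"].append(server_id)   # busy server: drain it first
    return plan


desired = {"bbb1.example.com": {}, "bbb3.example.com": {}}
current = {
    "bbb1.example.com": {"load": 5},
    "bbb2.example.com": {"load": 0},
    "bbb4.example.com": {"load": 12},
}
print(plan_sync(desired, current))
```

Running the function a second time after the cordoned server has drained would move it from the cordon list to the remove list, which mirrors the advice to repeat the task in cordon mode.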
./bin/rake servers:yaml[verbose]
Prints a YAML file compatible with servers:sync. This task can be used to bootstrap a cluster configuration file from an existing cluster, or get the current cluster state in a machine-readable format. If verbose is true, then additional fields (state, load and online) are included. These are ignored by servers:sync.
./bin/rake status
This will print a table listing all servers along with some basic statistics that can be used for monitoring the overall status of the deployment:
HOSTNAME          STATE    STATUS  MEETINGS  USERS  LARGEST MEETING  VIDEOS
bbb1.example.com  enabled  online        12     25                7      15
bbb2.example.com  enabled  online         4     14                4       5
To list specific meetings, use:
./bin/rake meetings:list["meetingId1:meetingId2:meetingId3"]
To list all meetings running across all BigBlueButton servers, use:
./bin/rake meetings:list
To end specific meetings, use:
./bin/rake meetings:end["meetingId1:meetingId2:meetingId3"]
To end all meetings running across all BigBlueButton servers, use:
./bin/rake meetings:end
./bin/rake meetings:info[meetingId]
This command will return the following details of a meeting:
Meeting ID: 1a813084f7af08b8d19239315c170b3decedfc03-2-1
Meeting Name: new class
Internal MeetingID: 4445471c7ae2987ddb11db3fa2d89f8c8f86c328-1633448534301
Created Date: Tue Oct 05 15:42:14 UTC 2021
Recording Enabled: true
Server id: bbb.example.com
Serevr url: https://bbb.example.com/bigbluebutton/api/
MetaData:
bbb-context-name: test124
analytics-callback-url: https://bbb1.example.com/bigbluebutton/api/analytics_callback
bbb-recording-tags:
bbb-origin-server-common-name:
bbb-context-label: test
bbb-origin: test
bbb-context: test
bbb-context-id: 2
bbb-recording-name: new class
bbb-origin-server-name: xx.xx.xxx.xx
bbb-recording-description:
bbb-origin-tag: moodle-mod_bigbluebuttonbn
For commercial help with setup and deployment of Scalelite, contact us at Blindside Networks.
This project uses BigBlueButton and is not endorsed or certified by BigBlueButton Inc. BigBlueButton and the BigBlueButton Logo are trademarks of BigBlueButton Inc.
Contributions to this project are welcome.