Deploying OpenAM 12 on Mesos

By: Hanna Ramez / 29 Jun 2016

Background

OpenAM provides open source Authentication, Authorization, Entitlement and Federation software. We chose it as part of our internal SSO platform. The documentation of OpenAM goes into a lot of details about the installation that I don’t need to go through here again.
OpenAM 12 was not designed as a stateless app, and it’s not meant to be running on an elastic environment like Mesos.

The purpose of this doc is to highlight the main pain points and challenges in deploying OpenAM on a Mesos cluster

References

Constraints

  • Mesos cluster only runs stateless non-persistent apps
  • Marathon framework to manage long running apps on Mesos
  • Need to run multiple instances for HA

Setup

1. External config store

Configure OpenAM to use openDJ for storing its configuation.
Use the Openam-configurator tool to push that config file.

By default OpenAM uses its own embeded backend for storing its config and dumping all that on disk in the OpenAM home as seen in the config below BASE_DIR+ The first time to run this tool it will create the BASE_DIR and connect to the backend to write configs.
In subsequent runs of the configurator tool it will connect to the backend and check that there is a config and only write the BASE_DIR part without changing any configs. From here on configs will be handled by sooadm tool or the webUI (console). There are 2 configs in that context

# Server properties
SERVER_URL=http://${HOSTNAME}:${PORT0}
DEPLOYMENT_URI=/openam
BASE_DIR=/tmp/openam
locale=en_US
PLATFORM_LOCALE=en_US
AM_ENC_KEY='some_random_string'
ADMIN_PWD='a_good_pass'
AMLDAPUSERPASSWD='another_good_pass'
COOKIE_DOMAIN=.example.com
ACCEPT_LICENSES=true
# External configuration data store
DATA_STORE=dirServer
DIRECTORY_SSL=SIMPLE
DIRECTORY_SERVER=${OPENDJ_SERVER}
DIRECTORY_PORT=${OPENDJ_PORT}
DIRECTORY_ADMIN_PORT=${OPENDJ_ADMIN_PORT}
ROOT_SUFFIX=$OPENAM_SUFFIX
DS_DIRMGRDN=${OPENDJ_USER}
DS_DIRMGRPASSWD=${OPENDJ_PASS}
# External User store
USERSTORE_TYPE=LDAPv3ForOpenDS
USERSTORE_SSL=SIMPLE
USERSTORE_HOST=${OPENDJ_SERVER}
USERSTORE_PORT=${OPENDJ_PORT}
USERSTORE_SUFFIX=ou=people,dc=example, dc=com
USERSTORE_MGRDN=${OPENDJ_USER}
USERSTORE_PASSWD=${OPENDJ_PASS}
# Site config
LB_SITE_NAME=example
LB_PRIMARY_URL=http://sso.example.com:80/openam
LB_SESSION_HA_SFO=true

Storing config in OpenDJ enables session failover . There is a whole section in the docs on how to prepare the external backend for OpenAM to use it, make sure to follow it correctly .

Clustering

Clustering in OpenAM is a topic that I did not dive deep into, but a few points to note.

  • Every server registers to the cluster with its hostname:port
  • When a server is down it does not leave the cluster or get deleted
  • Sessions are stored in memory on the host, so server down = session lost

2.1. CTS (centralized session store)

CTS is the means for OpenAM to be able to share its sessions among the hosts of the cluster.
It can use OpenDJ to store its sessions, hence allowing a user authenticated on one host to be able to resume on another server.
This is the most important part of the setup

It solves 2 problems

  • session sharing
  • and session persistence across reboots

This is again well documented in the docs, and pay special attention to the preparations on the OpenDJ side.

2.2. Scaling

As mentioned before, OpenAM does not clean its list of hosts in the cluster and as Mesos would launch each time a new instance, the need to clean old instances becomes evident.
There are 2 solutions for that, either in the script that launches OpenAM we catch TERM signals and remove the exiting host form the cluster (which is not so easy), or take a hacky way to ensure instances always have the same name.+ Here is how I achieved that.
This is the JSON file to launch on marathon.

{
"id": "openam",
"cmd":"mkdir jetty && tar xzf jetty-9.3.3.v20150827.tar.gz -C jetty --strip-components=1 && mv openam-server-12.0.0-criteo-15.war /tmp/openam.war && mkdir openam-configurator && unzip openam-configurator-12.0.0.zip -d openam-configurator && bash -x startup.sh",
"cpus": 2,
"mem": 1024,
"env": {
"OPENDJ_SERVER":"opendj.example.com",
"OPENDJ_USER":"cn=admin",
"OPENDJ_PASS":"password",
"OPENAM_SUFFIX":"dc=openam,dc=example,dc=com",
"JAVA_HOME": "/usr/lib/jvm/jre-1.8.0/bin"
},
"instances": 2,
"ports": [
33333
]
"requirePorts": true,
"constraints":[
["hostname","LIKE","slave0[1,2]"]
],
"healthChecks": [{
"path": "/auth/isAlive.jsp",
"protocol": "HTTP",
"portIndex": 0,
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3,
"ignoreHttp1xx": false
}],
"uris": [
"http://fileserver.example.com/openam-configurator-12.0.0.zip",
"http://fileserver.example.com/jetty.tar.gz",
"http://fileserver.example.com/openam.war",
"http://fileserver.example.com/startup.sh"
]
}
Specify the port number
Enforces the port number defined in 1
Limits the pool to 2 slaves

This way we always have 2 servers slave01:33333 and slave02:33333

3. Starting

Now that OpeAM needs to run its config everytime it starts, I use the startup.sh for that.
startup.sh is a bash script that starts openam on jetty and sends it to background, waits for it to be fully started by looking at the log file for Server:main: Started, then it will launch the openam-configurator tool to configure OpenAM.

# Start jetty
cd $JETTY_HOME
${JAVA_HOME}/java -server -jar start.jar -Djetty.http.port=${PORT0} ${JAVA_OPTS}> /tmp/openam.log 2>&1 &
until [ "$(grep -q 'Server:main: Started @' /tmp/openam.log && echo $?)" == "0" ]
do
sleep 5
done
cd ../

Then after that it writes the config file above and runs the configurator

${JAVA_HOME}/java -jar openam-configurator/openam-configurator-tool-12.0.0.jar -f /tmp/openam.conf
# run again to work around a bug solved in 12.0.1 but we have 12.0.0
if [ "$?" != "0" ]; then
${JAVA_HOME}/java -jar openam-configurator/openam-configurator-tool-12.0.0.jar -f /tmp/openam.conf
[ "$?" != "0" ] && exit 1
fi

then at the end let the script simulate a daemon

while true; do sleep 60; done

This way the script remains running and Mesos keeps the task, I don’t worry about terminating it correctly when Mesos is attempting to kill the task as Mesos will send a TERM to all tasks in that container anyway.

Now OpenAM is ready.

Next step is to configure your SSO solution