For more examples visit https://github.com/franzinc/agraph-examples

## Introduction

In this document we primarily discuss running a Multi-Master Replication (MMR) cluster inside Kubernetes; we will also show a Docker Swarm implementation. This directory and its subdirectories contain code you can use to run an MMR cluster. The second half of this document, Setting up and running MMR under Kubernetes, gives the steps needed to run the MMR cluster in Kubernetes.

MMR replication clusters differ from distributed AllegroGraph clusters in some important ways, and they don't quite fit the Kubernetes model perfectly; the design below describes how we accommodate those differences.
## The Design
We’ve had the most experience with Kubernetes on the Google Cloud Platform (GCP). Our design does not require that the load balancer support sessions; the GCP load balancer does not support them at this time, but that doesn’t mean session support isn’t present in load balancers on other cloud platforms. There is also a large community of Kubernetes developers, so a load balancer with session support may be available from a third party.

## Implementation

We build and deploy in three subdirectories. We’ll describe the contents of the directories first and then give step-by-step instructions on how to use them.

### Directory ag/

In this directory we build a Docker image holding an installed AllegroGraph. The Dockerfile is in this directory.
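The Dockerfile itself is not reproduced in this document; as a rough, hypothetical sketch of its shape (the base image, distribution filename, and install path are assumptions, not the actual values):

```dockerfile
# Hypothetical sketch only -- see ag/Dockerfile in the repository for the real file.
FROM ubuntu:22.04

# networking tools: not needed by AllegroGraph itself, but handy
# when debugging inside the container
RUN apt-get update && apt-get install -y curl iputils-ping dnsutils net-tools

# install an AllegroGraph server distribution (filename/version assumed)
COPY agraph-linuxamd64.64.tar.gz /tmp/
RUN tar xzf /tmp/agraph-linuxamd64.64.tar.gz -C /tmp

# AllegroGraph's default HTTP port
EXPOSE 10035
```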
The Dockerfile installs AllegroGraph into the image. We have to worry about the controlling-instance process dying and being restarted in another pod with a different IP address. Thus, if we’ve cached the DNS mapping of the controlling instance, we need to notice as soon as possible that the mapping has changed. We also install a number of networking tools. AllegroGraph doesn’t need these, but they are useful to have installed if we want to do debugging inside the container. The image created by this Dockerfile is pushed to the Docker Hub using an account you’ve specified (see below).

### Directory agrepl/

Next we take the image created above and add the specific code to support replication clusters. The Dockerfile is in this directory.
When building an image using this Dockerfile you must pass in a Docker account name, MyDockerAccount, which must be an account you’re authorized to push images to. The Dockerfile installs the scripts that start and manage the cluster, modifies the AllegroGraph configuration for replication, and prepares the mount point where a persistent volume will be installed. When this container starts it runs a startup script.
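The exact build invocation is not reproduced here; it presumably looks something like the following (the build-argument name DockerAccount is an assumption):

```
docker build --build-arg DockerAccount=MyDockerAccount -t MyDockerAccount/agrepl .
docker push MyDockerAccount/agrepl
```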
The startup script can be run under three different conditions:

1. In the controlling instance, starting a cluster for the first time.
2. In the controlling instance, restarting after its pod was terminated.
3. In a copy instance.

In cases 1 and 2 the environment variable Controlling will have the value “yes”. In case 2 there will be an existing repository directory on the persistent volume. In all cases we start an AllegroGraph server. In case 1 we create a new cluster. In case 2 we just sleep and let the AllegroGraph server recover the replication repository and reconnect to the other members of the cluster. In case 3 we wait for the controlling instance’s AllegroGraph server to be running, then wait for our own AllegroGraph server to be running, and then wait for the replication repository we want to copy to be up and running. At that point we can grow the cluster by copying the cluster repository. We also create a script which will remove this instance from the cluster should this pod be terminated. When the pod is killed (likely because we scaled down the number of copy instances) a termination signal is sent to the process first, allowing it to run this remove script before the pod completely disappears.

### Directory kube/

This directory contains the yaml files that create Kubernetes resources, which in turn create pods and start the containers that form the AllegroGraph replication cluster.

#### controlling-service.yaml

We begin by defining the services. It may seem logical to define the applications before defining the services that expose them, but it is the service we create that puts the application’s address in DNS, and we want the DNS information to be present as soon as possible after the application starts.
The selector in this file defines a service covering any container that has a label with a particular key and value.
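The actual file is in the kube/ directory; a minimal sketch of such a service (the service name controlling, the label key app, and the headless clusterIP setting are assumptions; 10035 is AllegroGraph's default port) might look like:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: controlling
spec:
  clusterIP: None        # headless: DNS resolves directly to the pod
  selector:
    app: controlling     # matches any pod labeled app: controlling
  ports:
    - port: 10035        # AllegroGraph's default HTTP port
```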
In fact, for all the yaml files described below, you create the object they define by passing the file to kubectl.
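For example, creating the controlling service presumably uses the standard kubectl invocation:

```
kubectl create -f controlling-service.yaml
```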
#### copy-service.yaml

We define a similar service for all the copy applications.
#### controlling.yaml

This is the most complex resource description for the cluster. We use a StatefulSet so that we have a predictable name for the single pod we create. We define two persistent volumes. A StatefulSet is designed to control more than one pod, so rather than a VolumeClaim we have a VolumeClaimTemplate, so that each pod can have its own persistent volume… but as it turns out we have only one pod in this set and we never scale up. There must be exactly one controlling instance.

We set up a liveness check so that if the AllegroGraph server dies Kubernetes will restart the pod and thus the AllegroGraph server. Because we’ve used a persistent volume for the AllegroGraph repositories, when the AllegroGraph server restarts it will find an existing MMR replication repository that was in use when the server last ran. AllegroGraph will restart that replication repository, which will cause that replication instance to reconnect to all the copy instances and become part of the cluster again.

We set the environment variable Controlling to “yes” and mount the persistent volumes into the container.
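The full controlling.yaml is in the kube/ directory; a trimmed, hypothetical sketch of the shape described above (the image name, mount path, and storage size are assumptions; the liveness probe here uses a simple TCP check on AllegroGraph's default port) might be:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: controlling
spec:
  serviceName: controlling
  replicas: 1                       # exactly one controlling instance
  selector:
    matchLabels:
      app: controlling
  template:
    metadata:
      labels:
        app: controlling            # matched by controlling-service.yaml
    spec:
      containers:
        - name: controlling
          image: mydockeraccount/agrepl   # assumed: the image you pushed
          env:
            - name: Controlling
              value: "yes"
          livenessProbe:
            tcpSocket:
              port: 10035           # restart the pod if the server dies
            initialDelaySeconds: 30
          volumeMounts:
            - name: data
              mountPath: /data      # assumed repository mount point
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```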
#### copy.yaml

This StatefulSet is responsible for starting all the other instances. It’s much simpler, as it doesn’t use persistent volumes.
#### controlling-lb.yaml

We define a load balancer so that applications on the internet outside our cluster can communicate with the controlling instance. The IP address of the load balancer isn’t specified here; the cloud service provider (e.g. Google Cloud Platform or AWS) will determine an address after a minute or so and will make that value visible when you list the services with kubectl.
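Listing the services is the standard kubectl command; the EXTERNAL-IP column stays pending until the cloud provider allocates an address:

```
kubectl get services
```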
The full definition is in the kube/ directory.
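A hypothetical sketch of such a load-balancer service (the service name is an assumption; type LoadBalancer is what asks the cloud provider for an external address):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: controlling-loadbalancer
spec:
  type: LoadBalancer      # cloud provider allocates the external IP
  selector:
    app: controlling      # routes to the controlling pod
  ports:
    - port: 10035
      targetPort: 10035
```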
#### copy-lb.yaml

As noted earlier, the load balancer for the copy instances does not support sessions. However, you can use this load balancer to issue queries or simple inserts that don’t require a session.
#### copy-0-lb.yaml

If you wish to access one of the copy instances explicitly, so that you can create sessions, you can create a load balancer that links to just one instance; in this case the first copy instance, which is named “copy-0”.
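A sketch of how such a single-pod load balancer can be expressed (the service name is an assumption; the statefulset.kubernetes.io/pod-name label is added automatically by the StatefulSet controller, so selecting on it matches exactly one pod):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: copy-0-loadbalancer
spec:
  type: LoadBalancer
  selector:
    # matches only the pod named copy-0
    statefulset.kubernetes.io/pod-name: copy-0
  ports:
    - port: 10035
      targetPort: 10035
```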
## Setting up and running MMR under Kubernetes

The code will build and deploy an AllegroGraph MMR cluster in Kubernetes. We’ve tested this on Google Cloud Platform and Amazon Web Services. This code requires persistent volumes and load balancers, and thus requires a sophisticated platform to run on (such as GCP or AWS).

### Prerequisites

In order to use the code supplied you’ll need two additional things: a Docker Hub account you can push images to, and an AllegroGraph distribution to install in the image.
### Steps

#### Do prerequisites

Fulfill the prerequisites above.

#### Set parameters

There are five parameters.
The first three parameters can be set by editing the Makefile in the top-level directory; the last two parameters are found in the agrepl/ build files.
The account must be specified, but the last two can be omitted, in which case they default to an AllegroGraph account name of test and a password of xyzzy. If you choose to specify a password, make it a simple one consisting of letters and numbers: the password will appear in shell commands and URLs, and our simple scripts don’t escape characters that have a special meaning to the shell or in URLs.

#### Install AllegroGraph

Change to the ag directory and build an image with AllegroGraph installed, then push it to the Docker Hub.
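Assuming the Makefile provides targets for this step (the target names here are assumptions, not the actual ones), it might look like:

```
cd ag
make build    # hypothetical target: build the image with AllegroGraph installed
make push     # hypothetical target: push the image to your Docker Hub account
```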
#### Create a cluster-aware AllegroGraph image

Add scripts to create an image that will either create an AllegroGraph MMR cluster or join an existing cluster when started.
#### Set up a Kubernetes cluster

Now everything is ready to run in a Kubernetes cluster. You may already have a Kubernetes cluster running, or you may need to create one. Both Google Cloud Platform and AWS have ways of creating a cluster using a web UI or a shell command. When you’ve got your cluster running, verify that kubectl can see it.
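The standard way to check connectivity is to list the cluster's nodes:

```
kubectl get nodes
```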
You should see your nodes listed. Once this works you can move on to the next step.

#### Run an AllegroGraph MMR cluster

Starting the MMR cluster involves setting up a number of services and deploying pods. The Makefile will do that for you.
When the services are first displayed you’ll see that there isn’t yet an external IP address allocated for the load balancers. It can take a few minutes for an external IP address to be allocated and the load balancers to be set up, so keep listing the services
until you see an IP address given; even then it may take another minute or two before connections succeed.

#### Verify that the MMR cluster is running

You can use AllegroGraph WebView to check that the MMR cluster is running. Once you have an external IP address for the controlling load balancer, visit that address in a web browser.
Log in with the credentials you used when you created the Docker images (the default is user test and password xyzzy). You’ll see the replicated repository listed. Click on that link and you’ll see a table of three instances which now serve the same repository. This verifies that three pods started up and all linked to each other.

### Namespaces

All objects created in Kubernetes have a name chosen either by the user or by Kubernetes based on a name given by the user. Most names have an associated namespace, and the combination of namespace and name must be unique among all objects in a Kubernetes cluster. Namespaces exist to prevent name clashes between multiple projects running in the same cluster that happen to choose the same name for an object. The default namespace is named default.

Another big advantage of using namespaces is that deleting a namespace deletes all objects whose name is in that namespace. This is useful because a project in Kubernetes uses many different types of objects, and if you want to delete everything you’ve added to a Kubernetes cluster it can take a while to find all the objects type by type and delete them. If you put all the objects in one namespace, you need only delete the namespace and you’re done. In the Makefile we have a line defining the namespace,
which is used by the reset rule.
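The namespace line and the reset rule are not reproduced above; they presumably look something like this (the namespace name testns is an assumption):

```makefile
# hypothetical sketch -- see the Makefile in the repository for the real rule
Namespace=testns

reset:
	kubectl delete namespace $(Namespace)
	kubectl create namespace $(Namespace)
```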
The reset rule deletes all members of the namespace named at the top of the Makefile. We include this in the Makefile because you may find it useful.

## Docker Swarm

The focus of this document is Kubernetes, but we also have a Docker Swarm implementation of an AllegroGraph MMR cluster. Docker Swarm is significantly simpler to set up and manage than Kubernetes, but has far fewer bells and whistles. Once you’ve built the ag and agrepl images and pushed them to the Docker Hub, you need only link a set of machines running Docker together into a Docker Swarm and deploy the stack; then the AllegroGraph MMR cluster is running. Once it is running you can access the cluster using WebView.
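Deployment on a swarm presumably uses docker stack deploy with a compose file (the compose filename and stack name here are assumptions):

```
docker stack deploy -c docker-compose.yml agraph
```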