Skip to main content

Raymii.org Raymii.org Logo

Quis custodiet ipsos custodes?
Home | About | All pages | Cluster Status | RSS Feed

Set up a federated XMPP Chat Network with ejabberd, your own Google Talk Hangouts alternative

Published: 11-06-2013 | Author: Remy van Elst | Text only version of this article


❗ This post is over eleven years old. It may no longer be up to date. Opinions may have changed.


This tutorial shows you how to set up your own federated chat network using ejabberd. It covers a basic single node ejabberd server and also the setup of an ejabberd cluster, including errors and DNS SRV record examples. Last but not least federation is also covered.

Recently I removed all Google Ads from this site due to their invasive tracking, as well as Google Analytics. Please, if you found this content useful, consider a small donation using any of the options below:

I'm developing an open source monitoring app called Leaf Node Monitoring, for windows, linux & android. Go check it out!

Consider sponsoring me on Github. It means the world to me if you show your appreciation and you'll help pay the server costs.

You can also sponsor me by getting a Digital Ocean VPS. With this referral link you'll get $200 credit for 60 days. Spend $25 after your credit expires and I'll get $25!

Why set up your own XMPP server

There are a few reasons to set up your own XMPP server.

You might use Google Talk or as it now is named Hangouts. Google's service recently changed and it is going to drop XMPP compatibility. If you have non- gmail chat contacts you can keep chatting to them. And still use an open protocol which is widely supported, not being locked in to google specific software and hardware.

Or you might want to have more control over the logging of your data. Turn of ejabberd logging and use Off The Record which gives you full privacy (and perfect forward secrecy).

You might want to use awesome multi-account chatting applications like Pidgin, Psi+, Empathy, Adium, iChat/Messages or Miranda IM. And on Android you can use Xabber, Beem or OneTeam. Did you know that big players like Facebook, WhatsApp and Google (used) to use XMPP as their primary chat protocol?

Or you might be a sysadmin in need of an internal chat solution. I've got a ejabberd cluster running for a client consisting of 4 Debian 7 VM's (2GB RAM each) spread over 3 sites and 1 datacenter, serving 12000 total users and most of the time 6000 concurrently.

XMPP is an awesome and extendible protocol, on which you can find more here: https://en.wikipedia.org/wiki/XMPP

Information

This setup is tested on Debian 7, Ubuntu 12.04 and 10.04 and OS X 10.8 Server, all running ejabberd installed via the package manager, either apt or ports. It also works on Windows Server 2012 with the ejabberd compiled from the erlang source but that is not covered in this tutorial.

This tutorial uses the example.org domain as the chat domain, and the server chat.example.org as the xmpp server domain. For the clustering part the servers srv1.example.org and srv2.example.org are used. Replace these values for your setup.

Single node / master node ejabberd installation

If you want to set up a single node installation of ejabberd, e.g. no clustering, then follow only this part and the DNS part of the tutorial. If you want to set up a cluster, then also follow this part and continue with the next part.

Installing Ejabberd

This is simple, use your package manager to install ejabberd:

apt-get install ejabberd

You will also install a few dependencies for the erlang runtime.

Configuring ejabberd

We are going to configure the ejabberd service. First stop it:

/etc/init.d/ejabberd stop

Now use your favorite text editor to edit the config files. The ejabberd config is erlang config, so comments are not # but %%. Also, every config option ends with a dot (.).

vim /etc/ejabberd/ejabberd.cfg

First we are going to add our chat domain name:

{hosts, ["example.org"]}.

If you want more domains then you add them as shown below:

{hosts, ["sparklingclouds.nl", "raymii.org", "sparklingnetwork.nl"]}.

This domain name is not the name of the servers you are adding.

Next we define an admin user:

{acl, admin, {user, "remy", "example.org"}}.

remy corresponds with the part before the @ in the XMPP ID, and example.org with the part after. If you need more admin users, add another ACL line.

Now if you want people to be able to register via their XMPP client enable in band registration:

{access, register, [{allow, all}]}.

If you are using MySQL or LDAP authentication then you wouldn't enable this.

I like to have a shared roster with roster groups, and some clients of mine use a shared roster with everybody so that nobody has to add contacts but they see all online users, enable the mod shared roster:

%% Do this in the modules block
  {mod_shared_roster,[]},

If you are pleased with the config file, save it and restart ejabberd:

/etc/init.d/ejabberd restart

We now need to register a user to test our setup. If you've enabled in-band registration you can use your XMPP client, and if you did not enable in-band registration you can use the ejabberdctl command:

ejabberdctl register remy example.org 'passw0rd'

Now test it using an XMPP client like Pidgin, Psi+ or Empathy. If you can connect, then you can continue with the tutorial. If you cannot connect, check your ejabberd logs, firewall setting and such to troubleshoot it.

Clustering ejabberd

Note that you have to have a correctly working master node to continue with the ejabberd clustering. If your master node is not working then fix that first.

Important: the modules you use should be the same on every cluster node. If you use LDAP/MySQL authentication, or a shared_roster, or special MUC settings, or offline messaging, for the clustering this does not matter as long as it is on all nodes.

So lets get started. We are first going to configure the master node, and then the slave nodes.

Prepare the master node

Stop the ejabberd server on the master and edit the /etc/default/ejabberd file:

vim /etc/default/ejabberd

Uncomment the hostname option and change it to a FQDN hostname:

ERLANG_NODE=ejabberd@srv1.example.org

And add the external (public) IP addres as a tuple (no dots but comma's):

INET_DIST_INTERFACE={20,30,10,5}

If you use ejabberd internally then use the primary NIC address.

We are going to remove all the mnesia tables. They will be rebuilt with an ejabberd restart. This is way easier then changing the mnesia data itself. Don't do this on a already configured node without backing up the erlang cookie.

First backup the erlang cookie:

cp /var/lib/ejabberd/.erlang.cookie ~/

Then remove the mnesia database:

rm /var/lib/ejabberd/*

And restore the erlang cookie:

cp ~/.erlang.cookie /var/lib/ejabberd/.erlang.cookie

To make sure all erlang processes are stopped kill all processes from the ejabberd user. This is not needed but the epmd supervisor process might still be running:

killall -u ejabberd

And start ejabberd again:

/etc/init.d/ejabberd start 

If you can still connect and chat, then continue with the next part, configuring the slave nodes.

Prepare the slave nodes

*A slave node should first be configured and working as described in the first part of this tutorial. You can copy the config files from the master node. *

Stop the ejabberd server:

/etc/init.d/ejabberd stop

Stop the ejabberd server on the master and edit the /etc/default/ejabberd file:

vim /etc/default/ejabberd

Uncomment the hostname option and change it to a FQDN hostname:

ERLANG_NODE=ejabberd@srv2.example.org

And add the external (public) IP addres as a tuple (no dots but comma's):

INET_DIST_INTERFACE={30,40,20,6}

If you use ejabberd internally then use the primary NIC address.

Now remove all the mnesia tables:

rm /var/lib/ejabberd/*

Copy the cookie from the ejabberd master node, either by cat and vim or via scp:

# On the master node
cat /var/lib/ejabberd/.erlang.cookie
HFHHGYYEHF362GG1GF

# On the slave node
echo "HFHHGYYEHF362GG1GF" > /var/lib/ejabberd/.erlang.cookie
chown ejabberd:ejabberd /var/lib/ejabberd/.erlang.cookie

We are now going to add and compile an erlang module, the easy_cluster module. This is a very small module which adds an erlang shell command to make the cluster addition easier. You can also execute the commands in the erlang functions itself on an erlang debug shell, but I find this easier and it gives less errors:

vim /usr/lib/ejabberd/ebin/easy_cluster.erl

Add the following contents:

-module(easy_cluster).

-export([test_node/1,join/1]).

test_node(MasterNode) ->
    case net_adm:ping(MasterNode) of 'pong' ->
        io:format("server is reachable.~n");
    _ ->
        io:format("server could NOT be reached.~n")
    end.

join(MasterNode) ->
    application:stop(ejabberd),
    mnesia:stop(),
    mnesia:delete_schema([node()]),
    mnesia:start(),
    mnesia:change_config(extra_db_nodes, [MasterNode]),
    mnesia:change_table_copy_type(schema, node(), disc_copies),
    application:start(ejabberd).

Save it and compile it into a working erlang module:

cd /usr/lib/ejabberd/ebin/
erlc easy_cluster.erl

Now check if it succeeded:

ls | grep easy_cluster.beam

If you see the file it worked. You can find more info on the module here: https://github.com/chadillac/ejabberd-easy_cluster/

We are now going to join the cluster node to the master node. Make sure the master is working and running. Also make sure the erlang cookies are synchronized.

On the slave node, start an ejabberd live shell:

/etc/init.d/ejabberd live

This will start an erlang shell and it will give some output. If it stops outputting then you can press ENTER to get a prompt. Enter the following command to test if the master node can be reached:

easy_cluster:test_node('ejabberd@srv1.example.org').

You should get the following response: server is reachable. If so, continue.

Enter the following command to actually join the node:

easy_cluster:join('ejabberd@srv1.example.org').

Here's example output from a successful test and join join:

/etc/init.d/ejabberd live
*******************************************************
* To quit, press Ctrl-g then enter q and press Return *
*******************************************************

Erlang R15B01 (erts-5.9.1) [source] [async-threads:0] [kernel-poll:false]

Eshell V5.9.1  (abort with ^G)

=INFO REPORT==== 10-Jun-2013::20:38:15 ===
I(<0.39.0>:cyrsasl_digest:44) : FQDN used to check DIGEST-MD5 SASL authentication: "srv2.example.org"

=INFO REPORT==== 10-Jun-2013::20:38:15 ===
I(<0.576.0>:ejabberd_listener:166) : Reusing listening port for 5222

=INFO REPORT==== 10-Jun-2013::20:38:15 ===
I(<0.577.0>:ejabberd_listener:166) : Reusing listening port for 5269

=INFO REPORT==== 10-Jun-2013::20:38:15 ===
I(<0.578.0>:ejabberd_listener:166) : Reusing listening port for 5280

=INFO REPORT==== 10-Jun-2013::20:38:15 ===
I(<0.39.0>:ejabberd_app:72) : ejabberd 2.1.10 is started in the node 'ejabberd@srv2.example.org'
easy_cluster:test_node('ejabberd@srv1.example.org').
server is reachable.
ok
(ejabberd@srv2.example.org)2> easy_cluster:join('ejabberd@srv1.example.org').

=INFO REPORT==== 10-Jun-2013::20:38:51 ===
I(<0.39.0>:ejabberd_app:89) : ejabberd 2.1.10 is stopped in the node 'ejabberd@srv2.example.org'

=INFO REPORT==== 10-Jun-2013::20:38:51 ===
    application: ejabberd
    exited: stopped
    type: temporary

=INFO REPORT==== 10-Jun-2013::20:38:51 ===
    application: mnesia
    exited: stopped
    type: permanent

=INFO REPORT==== 10-Jun-2013::20:38:52 ===
I(<0.628.0>:cyrsasl_digest:44) : FQDN used to check DIGEST-MD5 SASL authentication: "srv2.example.org"

=INFO REPORT==== 10-Jun-2013::20:38:53 ===
I(<0.1026.0>:ejabberd_listener:166) : Reusing listening port for 5222

=INFO REPORT==== 10-Jun-2013::20:38:53 ===
I(<0.1027.0>:ejabberd_listener:166) : Reusing listening port for 5269

=INFO REPORT==== 10-Jun-2013::20:38:53 ===
I(<0.1028.0>:ejabberd_listener:166) : Reusing listening port for 5280
ok
(ejabberd@srv2.example.org)3>
=INFO REPORT==== 10-Jun-2013::20:38:53 ===
I(<0.628.0>:ejabberd_app:72) : ejabberd 2.1.10 is started in the node 'ejabberd@srv2.example.org'

Exit your erlang shell by pressing CTRL+C twice. Now stop ejabberd and start it again:

/etc/init.d/ejabberd restart

You can now check in the admin webinterface if the cluster join succeeded:

http://srv1.example.org:5280/admin/nodes/

Ejabberd nodes

If it shows the other node you are finished. If not, see if the steps worked and check the below section on troubleshooting.

Repeat the above steps for every node you want to add. You can add as many nodes as you want.

Errors when clustering

When setting up your cluster you might run into errors. Below are my notes for the errors I found.

  • ejabberd restart does not restart epmd (erlang daemon)

    • overkill solution: killall -u ejabberd
  • ejabberd gives hostname errors

    • make sure the hostname is set correctly (hostname srv1.example.com)
  • ejabberd gives inconsistent database errors

    • backup the erlang cookie (/var/lib/ejabberd/.erlang.cookie) and then remove the contents of the /var/lib/ejabberd folder so that mnesia rebuilds its tables.
  • ejabberd reports "Connection attempt from disallowed node"

    • make sure the erlang cookie is correct (/var/lib/ejabberd/.erlang.cookie). Set vim in insert mode before pasting...

DNS SRV Records and Federation

The DNS SRV Record is used both by chat clients to find the right server address as well as by other XMPP servers for federation. Example: Alice configures her XMPP clients with the email address alice@example.org. Her chat client looks up the SRV record and knows the chat server to connect to is chat.example.org. Bob sets up his client with the address bob@bobsbussiness.com, and adds Alice as a contact. The XMPP server at bobsbussiness.com looks up the SRV record and knows that it should initiate a server2server connection to chat.example.org to federate and let Bob connect with Alice.

The BIND 9 config looks like this:

; XMPP
_xmpp-client._tcp                       IN SRV 5 0 5222 chat.example.org.
_xmpp-server._tcp                       IN SRV 5 0 5269 chat.example.org.
_jabber._tcp                            IN SRV 5 0 5269 chat.example.org.

It is your basic SRV record, both the client port and the server2server port, and legacy Jabber. If you have hosted DNS then either enter it in your panel or consult your service provider.

You can use the following dig query to verify your SRV records:

dig _xmpp-client._tcp.example.org SRV
dig _xmpp-server._tcp.example.org SRV

Or if you are on Windows and have to use nslookup:

nslookup -querytype=SRV _xmpp-client._tcp.example.org
nslookup -querytype=SRV _xmpp-server._tcp.example.org

If you get a result like this then you are set up correctly:

;; QUESTION SECTION:
;_xmpp-client._tcp.raymii.org.  IN      SRV

;; ANSWER SECTION:
_xmpp-client._tcp.raymii.org. 3600 IN   SRV     5 0 5222 chat.raymii.org.

The actual record for chat.raymii.org in my case are multiple A records:

;; ADDITIONAL SECTION:
chat.raymii.org.        3600    IN      A       84.200.77.167
chat.raymii.org.        3600    IN      A       205.185.117.74
chat.raymii.org.        3600    IN      A       205.185.124.11

But if you run a single node this can also be a CNAME or just one A/AAAA record.

Final testing

To test if it all worked you can add the Duck Duck Go XMPP bot. If this works flawlessly and you can add it and chat to it, then you have done everything correctly. The email address to add is im@ddg.gg.

Tags: chat , dns , ejabberd , erlang , federation , google-talk , jabber , tutorials , xmpp