This paper is a work in progress! These are my own personal notes, which will hopefully turn into some kind of white paper on how you can most likely scale your XMPP applications, depending on your needs, context, etc. Right now they are loose notes and thoughts that should NOT be used for scaling your environment.
- Put initial paper on git for versioning.
- Create table of content.
- Define large scale operations.
- Table of XMPP server features
- ...
- ...
- Proof read
- "Release" first draft
- Steffen Larsen,
Intro...
First of all, there are NO turnkey solutions that scale linearly! Each project and application will have its own optimization and solution space. Scaling XMPP takes a lot of skill.
My view is that to get the best performance, you have to know and observe how the system behaves in real-world situations for the specific use case. XMPP is a large protocol, especially with the many XMPP Extension Protocols (XEPs) that have been added over time (more than 340 at present). If you want to scale, you need a thorough knowledge of your XMPP server, inside and out, and of the XMPP protocol itself. Some requirements or suggested approaches in the protocol do not scale as specified, and you have to take the full solution design into account, from the client behaviour itself to the cluster architecture and code optimizations.
Large scale in terms of:
- Registered users
- Simultaneous users
- Throughput of messages
Many users means you need automated jobs for maintaining the server:
- uptime!
- scale
- deployable and easy to manage and upgrade (live)!
Restarting the server causes a storm of reconnects. Make sure that the server can throttle the different types of connections.
- besides hitting your own servers it will generate presence for S2S connections!
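One common way to throttle a reconnect storm is a token bucket per connection type. The sketch below is only an illustration: the class name, the `accept` helper and all rates and capacities are made-up assumptions, not settings from any particular XMPP server.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Allow `rate` new connections per second, with bursts up to `capacity`."""
    rate: float
    capacity: float
    tokens: float = 0.0
    last: float = 0.0

    def allow(self, now: float) -> bool:
        # Refill in proportion to the elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical limits: one bucket per connection type.
buckets = {
    "c2s": TokenBucket(rate=100.0, capacity=200.0, tokens=200.0),
    "s2s": TokenBucket(rate=10.0, capacity=20.0, tokens=20.0),
    "bosh": TokenBucket(rate=50.0, capacity=100.0, tokens=100.0),
}


def accept(conn_type: str, now: float) -> bool:
    """Return True if a new connection of this type may be accepted now."""
    return buckets[conn_type].allow(now)
```

Connections that are refused should be closed or delayed, and clients should ideally reconnect with a randomized backoff so that they do not all return at the same moment.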
- What stanzas will your app send the most?
  - presence
  - iq (buddy list): the server should support roster versioning
  - message
- Which transport will the client use? (TCP, BOSH, WebSockets)
- Client mobility? (Is it a mobile app? Network changes might have something to say.)
- Components for business logic: remember to scale those as well.
- MUC
- Pub/sub
- Offline storage (limit the size, it could grow enormously!)
- Is your XMPP server public and open for user registration? That makes abuse and attacks easier.
- Is your XMPP server public with anonymous logins?
- Is your XMPP server public but not open for registration?
- Is your XMPP server private, in a silo and in a controlled environment?
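As an illustration of limiting offline storage, here is a minimal in-memory sketch with a per-user cap. The class name, the cap of 100 and the "reject when full" policy are assumptions for the example; a real server would persist the queue and bounce the rejected message back to the sender.

```python
from collections import defaultdict

MAX_OFFLINE_PER_USER = 100  # hypothetical cap, tune for your use case


class OfflineStore:
    """Per-user offline message queue with a hard size limit."""

    def __init__(self, max_per_user: int = MAX_OFFLINE_PER_USER):
        self.max_per_user = max_per_user
        self._queues = defaultdict(list)  # bare JID -> list of stanzas

    def store(self, jid: str, stanza: str) -> bool:
        """Queue a stanza; return False if the user's queue is already full."""
        queue = self._queues[jid]
        if len(queue) >= self.max_per_user:
            return False  # caller should send an error back to the sender
        queue.append(stanza)
        return True

    def drain(self, jid: str) -> list:
        """Return and forget all queued stanzas when the user comes online."""
        return self._queues.pop(jid, [])
```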
It is a good idea to choose a server written in a language that you or your development team understands. Reading an Erlang stack trace can be a pain if you do not understand Erlang. The same goes for Java and its stack traces.
- Many servers support turning off features such as ...
- JVM tuning if your server runs on a JVM
- Erlang VM (BEAM) tuning if your server is written in Erlang
- Physical vs. hosted / cloud solutions
- RAM in the host: depends on the number of connections, their buddy lists etc. that are held in memory
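A back-of-the-envelope estimate of RAM can be sketched as below. The per-connection and per-roster-item byte counts are placeholders and pure assumptions; they have to be replaced with numbers measured on your own server.

```python
def estimate_ram_bytes(connections: int,
                       avg_roster_size: int,
                       per_connection_bytes: int = 50_000,
                       per_roster_item_bytes: int = 500) -> int:
    """Rough RAM estimate: session state plus in-memory roster per connection.

    The two default byte counts are hypothetical placeholders.
    """
    per_session = per_connection_bytes + avg_roster_size * per_roster_item_bytes
    return connections * per_session


# 100,000 simultaneous users with 50 contacts each, using the placeholder sizes:
total = estimate_ram_bytes(100_000, 50)
print(f"{total / 1e9:.1f} GB")  # 7.5 GB
```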
XEP-0198, section 5 (Resumption):
It can happen that an XML stream is terminated unexpectedly (e.g., because of network outages). In this case, it is desirable to quickly resume the former stream rather than complete the tedious process of stream establishment, roster retrieval, and presence broadcast.
Applies to both c2s and s2s.
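The bookkeeping behind acknowledgements and resumption can be sketched roughly as follows. This is a simplified illustration of the sender side only: the class and method names are made up, the `h` counter is not wrapped at 2^32 as the XEP requires, and nothing is written to a socket.

```python
from collections import deque


class OutboundStream:
    """Sender-side bookkeeping for acks and resumption (simplified)."""

    def __init__(self):
        self.acked = 0          # last h value reported by the peer
        self.unacked = deque()  # stanzas sent but not yet confirmed, oldest first

    def send(self, stanza: str) -> None:
        # A real server would also write the stanza to the socket here.
        self.unacked.append(stanza)

    def handle_ack(self, h: int) -> None:
        """The peer reports that it has handled h stanzas in total."""
        newly_acked = h - self.acked
        if newly_acked < 0 or newly_acked > len(self.unacked):
            raise ValueError("invalid h value")
        for _ in range(newly_acked):
            self.unacked.popleft()
        self.acked = h

    def resume(self, h: int) -> list:
        """On resumption the peer sends its last h; return what must be resent."""
        self.handle_ack(h)
        return list(self.unacked)
```

The point for scaling is that a resumed stream skips stream establishment, roster retrieval and presence broadcast, at the cost of keeping the unacknowledged queue in memory for disconnected sessions until a timeout.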
- memory
- sockets per process
- TCP/IP stack optimization
- Load balancer in front (if using multiple endpoints (connection managers))
- split up the XMPP server and the BOSH/WebSocket frontend (separate connection managers)
- stream compression (zlib/EXI) for constrained networks (might take a lot of CPU on the client)
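To get a feeling for what zlib stream compression buys you, the sketch below compresses stanzas on one long-lived zlib stream and flushes after each stanza so the peer can decode it immediately. The class name and the sample stanza are assumptions for illustration; the zlib calls are from the Python standard library.

```python
import zlib


class StreamCompressor:
    """One compressor/decompressor pair per XML stream (simplified)."""

    def __init__(self):
        self._comp = zlib.compressobj()
        self._decomp = zlib.decompressobj()

    def compress_stanza(self, stanza: bytes) -> bytes:
        # Z_SYNC_FLUSH emits all pending output so this stanza can be decoded
        # right away, while keeping the dictionary for the following stanzas.
        return self._comp.compress(stanza) + self._comp.flush(zlib.Z_SYNC_FLUSH)

    def decompress_chunk(self, chunk: bytes) -> bytes:
        return self._decomp.decompress(chunk)


stream = StreamCompressor()
presence = b"<presence from='a@example.org/phone'><show>away</show></presence>"
first = stream.compress_stanza(presence)
second = stream.compress_stanza(presence)
print(len(presence), len(first), len(second))
```

Repeated stanzas compress much better than the first one because the stream keeps its dictionary. The same state also costs memory and CPU per connection, on the server as well as on the client.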
Not all servers support clustering.
latency: While the BOSH draft document claims very low-latency, it will be difficult for BOSH to compete with WebSockets. Unless you have ideal conditions where HTTP/1.1 is supported all the way through all intermediaries and by the target server, the BOSH client and connection manager will need to re-establish connections after every packet and every request timeout. This will significantly increase latency and latency jitter. Low jitter is often more important for real-time applications than average latency. WebSocket connections will be very similar in latency and jitter to raw TCP connections.
small-packet overhead: In WebSockets there are two bytes of framing overhead for small messages. In BOSH, every message has HTTP request and response headers (easily 180+ bytes for each round-trip). In addition, each message is wrapped in XML (supposedly optional but the spec doesn't define how) with several session related attributes.
complexity: while BOSH uses existing mechanisms in the browser, it requires a moderately complex JavaScript library to implement the BOSH semantics. Managing this in JavaScript will also increase latency and jitter compared to a native/browser (or even Flash) implementation.
traction: BOSH started life as a way to make XMPP more efficient. It grew up out of the XMPP community and from what I can tell has gotten very little traction outside of that community. The draft documents for BOSH and XMPP are split apart, but there seems to be very little real world use of BOSH without XMPP.
problem on server: BOSH needs two connections per client on the server, WebSockets only one.
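The small-packet overhead can be made concrete with a bit of arithmetic. The frame header sizes below follow RFC 6455; note that the two-byte figure holds for small server-to-client frames, while client-to-server frames carry an extra 4-byte masking key. The 180-byte BOSH figure is the one quoted above, and the function names are made up for the example.

```python
def websocket_frame_overhead(payload_len: int, masked: bool) -> int:
    """Framing bytes for a single unfragmented WebSocket frame (RFC 6455)."""
    if payload_len <= 125:
        header = 2
    elif payload_len <= 0xFFFF:
        header = 4   # 2 bytes + 16-bit extended payload length
    else:
        header = 10  # 2 bytes + 64-bit extended payload length
    # Client-to-server frames must be masked with a 4-byte key.
    return header + (4 if masked else 0)


BOSH_HTTP_OVERHEAD = 180  # bytes per round trip, the figure quoted above


def overhead_ratio(payload_len: int, overhead: int) -> float:
    """Fraction of the bytes on the wire that is overhead, not payload."""
    return overhead / (payload_len + overhead)


stanza_len = 60  # a short presence or chat stanza
print(overhead_ratio(stanza_len, websocket_frame_overhead(stanza_len, masked=True)))
print(overhead_ratio(stanza_len, BOSH_HTTP_OVERHEAD))  # 0.75
```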
Fallback mechanisms.. when and where?
Overall: http://xmpp.org/extensions/xep-0286.html
robustness:
- stream management (XEP-0198)
- message carbons (XEP-0280)
optimization:
- compression (http://xmpp.org/extensions/xep-0138.html)
- queue presence on server side (http://xmpp.org/extensions/xep-0273.html?)
- roster versioning - to avoid presence storm!
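The idea behind roster versioning can be sketched like this: the client sends the version of its cached roster, and the server only sends the roster if that version is out of date. The hashing scheme and the function names are assumptions for the example; the version string is opaque to the client and a server may generate it any way it likes.

```python
import hashlib


def roster_version(roster: dict) -> str:
    """Derive an opaque version string from the roster (one possible scheme)."""
    canonical = "\n".join(f"{jid}|{name}" for jid, name in sorted(roster.items()))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def handle_roster_get(roster: dict, client_ver):
    """Return (ver, items); items is None when the client's cache is current."""
    current = roster_version(roster)
    if client_ver == current:
        return current, None      # empty result: client keeps its cached roster
    return current, dict(roster)  # full roster plus the new version
```

With large buddy lists and many logins this saves one full roster download per session.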
These cover a variety of internal XMPP server measures, including times to perform key functions, message counts, queue sizes, memory consumption, etc.
Tsung -... Load test what your app actually does: proxy it and replay the session multiple times.
Monitor to detect performance problems and possible attacks.
- Large bandwidth consumption
- Many stanzas of one particular kind could be an attack or an error
- Patterns of traffic ... limit those through throttling
- attacks might slow down your server and stop new users from connecting
- abuse
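Spotting "many stanzas of one kind" can be done with a sliding-window counter per sender and stanza kind. The class below is a minimal sketch; its name, the window length and the threshold are assumptions and should come from what normal traffic looks like in your own deployment.

```python
from collections import defaultdict, deque


class StanzaRateMonitor:
    """Flag senders that exceed a stanza count within a sliding time window."""

    def __init__(self, window_seconds: float = 10.0, threshold: int = 100):
        self.window = window_seconds
        self.threshold = threshold
        self._events = defaultdict(deque)  # (sender, kind) -> timestamps

    def record(self, sender: str, kind: str, now: float) -> bool:
        """Record one stanza; return True if the sender is over the threshold."""
        timestamps = self._events[(sender, kind)]
        timestamps.append(now)
        # Drop events that have fallen out of the window.
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        return len(timestamps) > self.threshold
```

A flagged sender can then be throttled, logged for inspection, or disconnected, depending on how sure you are that it is an attack and not an error.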
Comparisons are made with the latest versions of the servers.
Server | WebSocket | XEP-0198 | XEP-0273 | Roster Versioning | Clustering | Compress | BOSH | Language | Maturity |
---|---|---|---|---|---|---|---|---|---|
Prosody | X | X | X | - | X | Lua / C | |||
Tigase | X | (X) | X | X | X | Java | |||
ejabberd | - | ? | X | X | X | Erlang | |||
MongooseIM | X | ? | X | X | X | Erlang | |||
jabberd | - | ? | ? | - | ? | C | |||
Openfire | X | ? | X | X (Hazelcast) | X | Java | |||