I am not one to rant and rave, but I seem to have been pushed over the edge this morning by a large telco service that leaves a lot to be desired. Despite being innovative, it seems to be wasting more and more time that could be used for more productive pursuits.
The service is Mobile Money, currently being hailed as Africa’s savior in terms of providing financial services to the millions of unbanked people. Everybody knows that mobile telecom services in Africa have been very successful and are growing by leaps and bounds because of the infrastructure issues associated with laying, operating and maintaining fixed lines. Add the low cost of handsets ($10 Nokias are available with a battery that can last 5 to 7 days, oh yes) and the reach of SMS, and the result is mHealth and mEducation initiatives being developed.
Mobile money has been a core driver of mobile service usage in the last few years because it makes it easy to move money without the hassles of banks (queues, service fees). With the licensing of thousands of agents (there are now more agents than bars, supermarkets and groceries combined), getting access to money is as easy as walking to your local grocery store.
MTN Uganda (http://mtn.co.ug/) is the market leader in Uganda, with the greatest reach within the country; I would put its share at over 70%, but I can be corrected. The service is estimated to transact about UGX 5bn ($2.2m at current rates) per day, which is quite high considering average transaction values are in the $10 – $100 range.
Anyway, their success is maybe their undoing. Despite the phenomenal growth, the service is even worse than the electricity availability, with the platform having an average uptime of 50% during normal working hours. This comes after a 45-day downtime from November 1, 2011 to December 15, 2011 (which started as an upgrade and later turned into an outage).
From my software engineering background, I am still baffled at why this continuously happens to one of the largest telco providers, given the established DevOps (http://devops.com/ and http://en.wikipedia.org/wiki/DevOps) practices. What are the possible solutions or approaches?
- High Transaction Volumes
- Hardware – buy more hardware and throw more power at the problem
- Software – if it is not scalable, run a cluster of boxes across the switches and load balance the sessions; this is a known problem even with HTTP
- Interface Operations – In database speak we usually say: separate writes from reads. Split balance checking (reads) from withdrawals and deposits (writes) into distinct applications behind the interface. Use queues (Gearman, for example) to ensure that the transactions are completed. Have the reads (balance checks) run off slaves in the clusters …
- Notifications – SMS messages are good for delivery, but ensure they are sent and delivered. Queue the notifications so that they are always sent
- Provide options to execute transactions – provide a web interface for clients and agents. This opens up new revenue and agent opportunities since Internet cafe owners can also provide services from their interfaces. This is just an alternate way to access the service
- Be open with the public to manage expectations – provide updates on service outages so that users do not just keep trying and only find out after many failed attempts. Failed transactions have been identified as a known cause of application load spikes
- Reduce the number of available services and offload some services to other channels
- Use open source software, which has been proven to scale – or maybe some newer versions of your software applications
- New – Provide APIs so that developers can provide custom solutions to offload processing off your core system (switchboard)
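To make the read/write separation and the queued notifications from the list above a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the class name `MobileMoneySketch`, the two in-memory dictionaries standing in for a master database and a read-only slave, and the standard library `queue.Queue` standing in for a real job queue such as Gearman. It is not how MTN's platform works, just one way the idea could look.

```python
import queue
import threading


class MobileMoneySketch:
    """Hypothetical toy service: queued writes, reads served from a replica."""

    def __init__(self, opening_balances):
        self.primary = dict(opening_balances)   # stands in for the master database
        self.replica = dict(opening_balances)   # stands in for a read-only slave
        self.write_queue = queue.Queue()        # stands in for Gearman or similar
        self.notifications = queue.Queue()      # outgoing SMS, queued so none is dropped
        self.worker = threading.Thread(target=self._write_worker, daemon=True)
        self.worker.start()

    def check_balance(self, account):
        # Reads never touch the write path or the primary store.
        return self.replica.get(account, 0)

    def submit(self, account, amount):
        # Deposits (positive) and withdrawals (negative) are only enqueued here.
        self.write_queue.put((account, amount))

    def _write_worker(self):
        # A single worker applies queued transactions in order.
        while True:
            job = self.write_queue.get()
            if job is None:  # sentinel: stop the worker
                self.write_queue.task_done()
                return
            account, amount = job
            new_balance = self.primary.get(account, 0) + amount
            if new_balance >= 0:
                self.primary[account] = new_balance
                self.replica[account] = new_balance  # toy "replication"
                self.notifications.put(f"{account}: balance is now UGX {new_balance}")
            else:
                self.notifications.put(f"{account}: transaction of UGX {amount} declined")
            self.write_queue.task_done()

    def drain(self):
        # Wait for all queued writes, stop the worker, return pending notifications.
        self.write_queue.join()
        self.write_queue.put(None)
        self.worker.join()
        messages = []
        while not self.notifications.empty():
            messages.append(self.notifications.get())
        return messages


def run_demo():
    service = MobileMoneySketch({"alice": 50_000})
    service.submit("alice", -20_000)  # withdrawal
    service.submit("alice", -40_000)  # would overdraw, so it is declined
    service.submit("alice", 5_000)    # deposit
    messages = service.drain()
    return service.check_balance("alice"), messages


if __name__ == "__main__":
    balance, messages = run_demo()
    print(balance)
    for message in messages:
        print(message)
```

The point of the design is that a flood of balance checks never competes with withdrawals and deposits, and a notification is only produced once its transaction has actually been processed. In a real deployment the queue would need to be persistent and the SMS sender would need retries, which this sketch leaves out.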
These are just quick thoughts, but they should be sufficient to start the discussion … not only to rant and rave but also to provide some concrete solutions