What or who is Ufi learndirect?
In 1998 the concept of a ‘University for Industry’ led to the creation of Ufi, branded as learndirect. In the eight years since its inception, learndirect has become the largest e-learning network of its kind worldwide, and has pioneered the delivery of learning to a mass audience through a unique combination of flexibility, accessibility and support.
This article examines the infrastructure underpinning learndirect’s learner management system (LMS), describes learndirect’s service characteristics and explores the role that ‘right sourcing’ has played in learndirect’s success.
Technical and Service Context
The learndirect LMS:
- Supported 447,000 learners last year.
- Typically deals with 4,000 concurrent learners at peak times.
- Typically consumes 80 Mb/s of bandwidth.
At this volume, an outage can have a significant impact; yet the learndirect LMS averages 99.98% systems availability.
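To put these figures in perspective, the short calculation below converts 99.98% availability into an annual downtime budget and derives a rough per-learner bandwidth. It is only a back-of-envelope sketch: it assumes a non-leap year of 24 x 7 operation, and it assumes that the 80 Mb/s figure coincides with the 4,000-learner peak, which the figures above do not state.

```python
# Figures quoted in the article.
AVAILABILITY = 0.9998             # 99.98% systems availability
BANDWIDTH_MBPS = 80               # megabits per second
CONCURRENT_LEARNERS = 4000        # concurrent learners at peak times

# Assumption: a non-leap year of continuous (24 x 7) service.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Return the minutes per year a service may be down at this availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def kilobits_per_learner(bandwidth_mbps: float, learners: int) -> float:
    """Return the average kb/s per learner, assuming bandwidth is shared evenly."""
    return bandwidth_mbps * 1000 / learners


if __name__ == "__main__":
    print(f"Downtime budget: {downtime_minutes_per_year(AVAILABILITY):.1f} minutes/year")
    print(f"Per learner: {kilobits_per_learner(BANDWIDTH_MBPS, CONCURRENT_LEARNERS):.0f} kb/s")
```

Under those assumptions, the availability figure leaves roughly 105 minutes of downtime a year, and each concurrent learner accounts for about 20 kb/s.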
Like most LMSs, the learndirect system provides a more personalised experience than typical ‘content sites’ such as the BBC or CNN. While such sites may offer some personalisation, this is typically limited to a sub-set of content that is presented according to preferences or tracked activity. Critically, the content itself does not change from consumer to consumer, or in response to user interaction. As a result such sites can load-balance their content across a number of servers or caches and require relatively little tracking.
In contrast, systems such as the learndirect LMS track a learner’s progress through a piece of learning and adapt content in response to learner engagement (for example via personalised formative assessment). As a result, the learndirect LMS requires an authoritative repository with which to track consumer behaviour.
When presented with a slow or unresponsive site, a consumer can choose to go elsewhere; with web-delivered learning, however, there is nowhere else to go. Nor is it enough that a system is available and returns content. If e-learning is to be effective, the medium needs to be as unobtrusive as possible: content has to render without the consumer becoming aware of any delay. This presents us with a two-fold challenge: each user’s content is customised, and there is a service expectation of 100% availability and responsiveness. In addition, we face the problems common to any large-scale system that aspires to be available 24 x 7. Constructing such a service is a serious web engineering exercise!
So, the first key differentiator between running a service and a system is monitoring. If you are not monitoring the service, then you are just running a system. Without appropriate monitoring software it is likely that the first person to tell you that your service has a problem will be one of your consumers, and in all probability they won’t tell you immediately.
Choose the right tools
When the learndirect service was first constructed, a very expensive piece of software was purchased to perform system monitoring. Unfortunately, the load associated with that particular tool was sufficient to harm the system. The tool itself was sold as the usual universal panacea. However, in implementation it was clear that its forte was component monitoring and not service monitoring.
We evaluated several of the alternative open source monitoring products available at that time and selected two: Nagios for event monitoring and Cacti for trend and volume monitoring.
We use Nagios to monitor events from two locations:
- Inside our demilitarised zone (DMZ). Nagios checks the system every 90 seconds against predefined thresholds, giving us a status of Green, Amber or Red.
- The public Internet. From this location, we can look at the service(s) from the end user's perspective.
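The threshold logic described above can be sketched in a few lines. This is an illustration rather than our actual configuration: the metric (page response time), the threshold values and the `classify` function are all hypothetical. The numeric values follow the standard Nagios plugin exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL); pairing them with Green, Amber and Red is an assumption made for this sketch.

```python
from enum import Enum


class Status(Enum):
    # Values are the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL).
    # Pairing them with Green/Amber/Red is an assumption made for this sketch.
    GREEN = 0
    AMBER = 1
    RED = 2


def classify(value: float, warning: float, critical: float) -> Status:
    """Map a measurement to a status, where higher values are worse."""
    if warning > critical:
        raise ValueError("warning threshold must not exceed critical threshold")
    if value >= critical:
        return Status.RED
    if value >= warning:
        return Status.AMBER
    return Status.GREEN


if __name__ == "__main__":
    # Hypothetical check: response time in seconds, thresholds invented for illustration.
    for response_time in (0.8, 2.5, 6.0):
        status = classify(response_time, warning=2.0, critical=5.0)
        print(f"{response_time:.1f}s -> {status.name}")
```

Running the same check from both locations, inside the DMZ and from the public Internet, would show whether a Red status reflects a component fault or something the end user actually experiences.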
Implementing such a tool is not to be undertaken lightly. Two things are more difficult than installing the system in the first place: a) getting the sensitivity correct (in order to prevent false alarms); and b) establishing the operations culture so that staff respond rapidly when an alert is sent out.
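One common way to tune sensitivity is to alert only after several consecutive failed checks, so that a single transient failure does not page anyone. Nagios offers a `max_check_attempts` setting in this spirit; the class below is a simplified, hypothetical sketch of the idea and does not reproduce Nagios's own state handling or our production settings.

```python
class AlertDamper:
    """Raise one alert only after `max_attempts` consecutive failed checks."""

    def __init__(self, max_attempts: int = 3) -> None:
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        self.max_attempts = max_attempts
        self.consecutive_failures = 0
        self.alerted = False

    def observe(self, check_ok: bool) -> bool:
        """Record one check result; return True exactly when an alert should be sent."""
        if check_ok:
            # Recovery resets the counter and re-arms the alert.
            self.consecutive_failures = 0
            self.alerted = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.max_attempts and not self.alerted:
            self.alerted = True
            return True
        return False


if __name__ == "__main__":
    damper = AlertDamper(max_attempts=3)
    # Hypothetical sequence of check results: one blip, then a sustained failure.
    results = [False, True, False, False, False, False, True]
    for ok in results:
        print("ok" if ok else "fail", "-> ALERT" if damper.observe(ok) else "")
```

A higher `max_attempts` gives fewer false alarms but a slower response to a real outage, which is the trade-off that has to be tuned.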
Trend and volume monitoring
While Nagios tells us when we have a specific problem, Cacti provides us with the information to understand or diagnose its root cause. In measuring volumes and their trends, Cacti allows us to look across the whole application stack at any point in time and examine critical volumes.
As open source tools, both Nagios and Cacti are highly extensible. We have been able to write and adapt agents to interface with them and we have been able to monitor and show trends in all our services.
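To illustrate what trend monitoring adds over event monitoring, the sketch below fits a straight line to a series of evenly spaced volume samples and estimates how many sampling intervals remain before the trend reaches a capacity limit. Cacti itself stores and graphs time series using RRDtool; the least-squares projection, the function names and the sample data here are hypothetical and are not part of Cacti.

```python
from typing import Optional, Sequence, Tuple


def linear_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def intervals_until(samples: Sequence[float], capacity: float) -> Optional[float]:
    """Estimate intervals after the last sample until the trend reaches capacity.

    Returns None when the trend is flat or falling, and 0.0 when the fitted
    line has already reached capacity.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    crossing = (capacity - intercept) / slope
    return max(0.0, crossing - (len(samples) - 1))


if __name__ == "__main__":
    # Hypothetical daily peak bandwidth in Mb/s, and a hypothetical 100 Mb/s link.
    daily_peaks = [62.0, 64.5, 66.0, 69.5, 71.0]
    print(intervals_until(daily_peaks, capacity=100.0))
```

A projection like this is only as good as the assumption that growth is linear, so it is best read alongside the graphs rather than in place of them.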
Culture and tools
After we had implemented the IT Infrastructure Library (ITIL), which gave us common definitions and terminology, the single most important cultural change we made was projecting the critical volumes provided by Nagios and Cacti onto big flat screens, visible to everyone in our operations and service team.
Most government contracts are outsourced, and it is rare for suppliers to allow monitoring at this level. Establishing a right to monitor in our original contract was an important factor in getting transparency in our service management.
Do you have a technology strategy?
Most organisations have either an implicit or explicit technology strategy. At the risk of stating the obvious, the technology and service model that an organisation chooses can mean the difference between a successful business and one that fails. As a consequence, organisations and IT directors tend to be conservative in their decision-making.
Ufi's Technology Strategy provides us with a framework that allows the organisation to make ‘good’, strategic choices. These choices are deployed within a governance framework to ensure that business and service models dependent on technology can be delivered now and in the future.
At a simplistic level, technology is used for three things within an organisation:
1. To run the business
2. To change the business
3. To innovate.
Unless you are a start-up, the bulk of investment and cost is already sunk in running your company. Changing the company information strategy usually occurs incrementally and takes the form of modifying the status quo. We are left with the shiny innovation tip of the cost iceberg to introduce new ways of doing things. Accepting this to be the case, we can see that technology strategies have considerable inertia, and unless there are strong external pressures (failure to meet service levels, company financial pressure, loss of market share), the adoption of new technologies is going to be slow.
In-source, out-source, right-source
The last ten years have seen the trend to out-source IT services and development continue to grow. This is unsurprising, given the risk and cost of getting it wrong. Out-sourcing companies carry the allure of having solved every problem before and of offering a large pool of experienced staff. Many organisations have reduced the cost and risk of running their IT systems significantly as a result of out-sourcing.
Central to a successful out-sourced system is a well-specified contract and a description of service requirements. Examples of good candidates for out-sourcing are payroll or desktop management. In both cases, an organisation can describe what it is that it wants and the amount of future change required can be estimated accurately.
If your IT application is the core of what your organisation does (as is the case with the learndirect LMS) and if you know you are going to undergo an annual cycle of change, then managing your operations in-house should be considered. Having taken learndirect's operations in-house, we have seen a significant reduction in cost and have improved service availability to a position where it is now better than 99.9%.
Right-sourcing is therefore getting the balance right between those services you choose to run yourself and those you are prepared to allow others to run for you. This balance will, and does, change as organisations and technology mature.
If you have brought your application development or hosting in-house then you have the opportunity to exploit open source tools and applications for competitive and/or service advantage. Having done this with the operation and now the development of our core application, we have put open source technology at the core of our technology strategy.
While we retain Oracle as our database of choice, we have adopted a wide range of open source tools, including Apache, Squid, JBoss, Hibernate, MySQL and Linux. The advantages are obvious:
- They are standards compliant, or effectively comprise a cross-platform standard in their own right.
- They are robust and open to peer review: issues and problems can be rapidly identified and resolved.
- They are often designed and built by practitioners and, as such, often come with built-in solutions for real-world problems.
- They increasingly come with support contracts.
Key lessons learned
We have learned a number of valuable lessons from our experience running learndirect's LMS service:
- Don’t confuse running a service with running an application. Monitoring and non-functional requirements, such as usability, performance, security and the ability to change, have a high impact.
- Monitoring, and acting on what it tells you, is critical in running a service.
- Get a technology strategy that supports the business, and recognise that once started it is often expensive to change.
- The balance between in-sourcing and out-sourcing (right-sourcing) will affect what you can control.
- Open source tools can be used to run world-class infrastructure.
It should be noted that the views expressed in this piece are my own and do not necessarily represent those of my organisation.
Director of Technology