USENIX LISA: When 1000 Computers are too Slow
Open source developer Tobias Oetiker, in his presentation at the USENIX LISA conference in San Diego, described how he helped one thousand Windows users at Swisscom.com speed up their computers.
By Oetiker's account, he owed this major assignment to a prize he had received at the LISA conference two years earlier for his work on two open source projects, MRTG and RRDTool. MRTG stands for Multi Router Traffic Grapher, software that graphs the traffic load on network interfaces from counters polled via SNMP. That got him on the front page of a Swiss newspaper, and soon afterward Swisscom.com hired him to resolve their computer problems. At the conference he reported on his unusual assignment with the talk "How to Proceed When 1000 Call Agents Tell you, 'My Computer is Slow': Creating a User Experience Monitoring System."
Reproducibility was the first of Oetiker's challenges: users would complain of slow systems but had trouble reproducing the conditions or symptoms. Oetiker's IT experts attacked this from two sides. First, they passively monitored the systems with CPV Monitor, software created specifically for this project. Second, users themselves could voluntarily use CPV Reporter for active monitoring and send error notifications directly to the IT team. Although CPV is not exactly free software, the benefit for Oetiker was that it used many open source packages and was written in Perl.
These two tools got a handle on many of the problems, but more kept emerging. The data collected from the 1,429 units amounted to more than two million entries a day and quickly became too much to handle: performance on the server suffered considerably. The technicians therefore decided to keep only the data from a randomly selected subset of units, which reduced the volume to 12% of the original.
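The talk summary does not say how the random selection of units was implemented (CPV itself was written in Perl). As a rough illustration only, the following Python sketch shows one way to reduce a data set by sampling units rather than individual entries. The function names, the unit identifiers, and the entry layout are all hypothetical, and the reduction only lands near 12% of the volume if units report roughly equal amounts of data.

```python
import random


def sample_units(unit_ids, fraction, seed=None):
    """Return a random subset of units (hypothetical helper).

    Sampling whole units keeps each selected machine's history complete,
    which sampling individual entries would not.
    """
    rng = random.Random(seed)
    unique = sorted(set(unit_ids))
    k = max(1, round(len(unique) * fraction))
    return set(rng.sample(unique, k))


def filter_entries(entries, kept_units):
    """Keep only the entries that come from the selected units."""
    return [e for e in entries if e["unit"] in kept_units]


# Illustrative data: 1,429 units (the figure from the talk), three
# made-up entries per unit. Names and fields are assumptions.
units = [f"pc{n:04d}" for n in range(1429)]
entries = [
    {"unit": u, "metric": "login_seconds", "value": 1.0}
    for u in units
    for _ in range(3)
]

kept = sample_units(units, 0.12, seed=42)
reduced = filter_entries(entries, kept)
print(len(kept), "units kept,", len(reduced), "of", len(entries), "entries")
```

Fixing the seed makes the selection repeatable, so the same units stay in the sample from one day to the next.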
Once problem solving had become more manageable in this way, yet another challenge confronted the team: causal research into Windows behavior and responses. Oetiker provided a number of anecdotes. To the delight of the audience, he commented on Windows Explorer's reaction to a non-responding application: "The Not Responding message is more or less for your entertainment" and has nothing to do with what's actually happening on the computer. Oetiker's conclusion, in response to an attendee's question, was that "For me, Windows continues to remain a complete mystery."
One lesson he did take from the project was, "From the moment users sent in their problems via CPV Reporter, they subjectively had the feeling things were getting better, even though we hadn't even begun working on a solution. They in fact were part of the solution."