Saturday, April 6, 2013

Uploading to MEGA: the CPU bottleneck in VPS/servers

Many people have reported slow upload speeds to MEGA, both in the browser and with MegaUploader.

The test machine I use is a 4-year-old Q6600 (4 cores) with a 5Mbps upload connection. It uploads at the maximum speed the line allows (600KB/s).
However, I recently had access to a VPS and ran some tests. Although it has a 1Gbps shared connection, it never got much more than 300KB/s per file. It was very strange.

With just one file, the upload speed was 350KB/s, with CPU peaks of 80% (the VPS has only 1 core). Uploading 2 files at the same time, it reached about 700KB/s, with the CPU at 80% and peaks of 100%! Trying to upload 3 files was a terrible idea. Yeah, I got an (unstable) 1MB/s upload speed, but the CPU was always at 100% and the VPS was slooooow. I was kicked out of RDP several times.

After some tests and traces, I came to a conclusion: the CPU is a BIG bottleneck!

Information is uploaded to MEGA in chunks of 1MB (well, technically the first chunks are smaller and then they reach a size of 1MB).
MegaUploader creates a thread for each file being uploaded. Each thread ciphers a chunk (applies an AES cipher) and then creates a parallel task to upload that chunk. Meanwhile, it starts to cipher the next chunk. Once the next chunk is ciphered, it waits for the upload task to finish. This way there is never a queue of ciphered chunks, so memory usage is always low. As you can see, there are a lot of threads, so a multi-core processor helps a lot.
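
The "smaller first chunks" mentioned above can be sketched like this. The post only says that the first chunks are smaller and then settle at 1MB; the 128KB starting size and 128KB growth step below are my assumption about MEGA's layout, not something taken from MegaUploader's code, and `chunk_ranges` is a hypothetical helper name.

```python
def chunk_ranges(file_size, first=128 * 1024, step=128 * 1024, cap=1024 * 1024):
    """Yield (offset, length) pairs for a file of `file_size` bytes.

    Chunk sizes start at `first`, grow by `step` each time, and stop
    growing once they reach `cap` (1MB). The first/step defaults are
    assumptions, see the paragraph above.
    """
    offset = 0
    size = first
    while offset < file_size:
        length = min(size, file_size - offset)  # last chunk may be short
        yield offset, length
        offset += length
        size = min(size + step, cap)


if __name__ == "__main__":
    for offset, length in chunk_ranges(3 * 1024 * 1024):
        print(offset, length)
```

Whatever the exact sizes are, the point is the same: after a short ramp-up, every chunk is 1MB, and each one has to be ciphered before it can be sent.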

With home connections, the bottleneck is the upload itself. While one chunk is being uploaded, another thread ciphers the next chunk. Once it is ciphered, it waits until the upload thread is free. So the CPU works, then waits. That's why you see "CPU peaks" if you look at the task manager. Home PCs normally have two or more cores (nowadays it is rare to find a CPU with just one core) and a terrible upload speed, so MegaUploader can squeeze the connection: if the CPU ciphers each chunk faster than the connection can upload it, the bottleneck is your connection.
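
The hand-off described above (cipher the next chunk while the current one uploads, then wait for the upload before handing over) can be sketched roughly as follows. This is not MegaUploader's code: `cipher_chunk` is a stand-in XOR transform instead of AES (the Python standard library has no AES), and `upload_chunk` is whatever callable actually sends the bytes.

```python
from concurrent.futures import ThreadPoolExecutor


def cipher_chunk(chunk: bytes) -> bytes:
    # Stand-in for the AES step: NOT real encryption, just some CPU work.
    return bytes(b ^ 0x5A for b in chunk)


def upload_file(chunks, upload_chunk):
    """Cipher chunk N+1 while chunk N uploads.

    At most one ciphered chunk is in flight and one is waiting, so
    there is no growing queue and memory use stays low.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = None
        for chunk in chunks:
            ciphered = cipher_chunk(chunk)  # overlaps with the upload in flight
            if pending is not None:
                pending.result()            # wait until the uploader is free
            pending = pool.submit(upload_chunk, ciphered)
        if pending is not None:
            pending.result()                # wait for the last chunk


if __name__ == "__main__":
    sent = []
    upload_file([b"first chunk", b"second chunk"], sent.append)
    print(len(sent), "chunks uploaded")
```

On a slow line, almost all the time is spent in `pending.result()`: the cipher finishes early and the CPU idles, which is exactly the "peaks" pattern in the task manager.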

However, with a cheap VPS we have the opposite case. The VPS often has just one core, or if you get a low-end dedicated server (like a Kimsufi) you have an Atom core... with a 100Mbps upload connection, or maybe even 1Gbps!

This means that the upload process is very fast... but the ciphering process is not. So the CPU never has to wait for the upload thread to finish: it is always working, and it is the upload process that has to wait! And with just one core, the more threads you create (that is, the more files you upload at once), the more context switches there are, and the slower the VPS gets.
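
A back-of-the-envelope way to see both cases: a two-stage pipeline can go no faster than its slower stage. In the sketch below, the 600KB/s home link and the 350KB/s VPS figure come from the tests above (I am treating the single-file VPS speed as its cipher rate, which is an assumption); the 5000KB/s home cipher rate is made up for illustration, and 1Gbps is taken as roughly 125000KB/s.

```python
def upload_speed(cipher_kbs: float, link_kbs: float) -> float:
    """Effective speed of a cipher-then-upload pipeline, in KB/s."""
    return min(cipher_kbs, link_kbs)


def bottleneck(cipher_kbs: float, link_kbs: float) -> str:
    """Name the slower stage."""
    return "CPU" if cipher_kbs < link_kbs else "connection"


if __name__ == "__main__":
    # Home PC: fast CPU (made-up 5000 KB/s cipher rate), slow 600 KB/s link.
    print("home:", upload_speed(5000, 600), bottleneck(5000, 600))
    # VPS: slow single core (about 350 KB/s), 1Gbps link (about 125000 KB/s).
    print("vps: ", upload_speed(350, 125_000), bottleneck(350, 125_000))
```

This simple model ignores context switches, so on a single core the real numbers with several files at once will be worse than the model suggests.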

So, to sum up, uploading to MEGA requires not only bandwidth but also a lot of CPU. If you have a great upload connection, you will also need many GHz and many cores in order to avoid a bottleneck.

If you use a cheap VPS/server to upload, you will experience slow speeds and high CPU consumption.

On my side, I will look at the CPU consumption of the ciphering process. Most of the cycles are spent not on ciphering the data but on generating the CBC-MAC, a 16-byte code that is required when uploading the file. If that CBC-MAC is not correct, you get the famous "decryption error" when downloading. Maybe I will be able to improve the performance of that process. I hope so ;)
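
To give an idea of why the MAC is not free, here is a minimal sketch of a generic CBC-MAC. It is not MEGA's exact scheme and not MegaUploader's code: `toy_block` is a hash-based stand-in for the AES block operation (the Python standard library has no AES), and the zero padding is my own simplification. What it does show is that the block function runs once for every 16 bytes of data, and each call depends on the previous result, so the work cannot be skipped or reordered.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def toy_block(block: bytes) -> bytes:
    # Stand-in for one AES block encryption. NOT a real cipher.
    return hashlib.sha256(b"demo-key" + block).digest()[:BLOCK]


def cbc_mac(data: bytes, encrypt_block, iv: bytes = bytes(BLOCK)) -> bytes:
    """Chain every 16-byte block through `encrypt_block`; return the last state."""
    if len(data) % BLOCK:
        data += bytes(BLOCK - len(data) % BLOCK)  # zero-pad (simplification)
    mac = iv
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        mixed = bytes(a ^ b for a, b in zip(mac, block))
        mac = encrypt_block(mixed)  # one block operation per 16 bytes
    return mac


if __name__ == "__main__":
    chunk = bytes(1024 * 1024)  # one 1MB chunk
    print("block operations per chunk:", len(chunk) // BLOCK)  # 65536
    print("mac:", cbc_mac(chunk, toy_block).hex())
```

So for every 1MB chunk there are 65536 extra block operations on top of the ciphering itself, which is where I expect to find room for improvement.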