Tuesday, January 26, 2016

Thousands of Containers, Millions of Containers: Towards Sufficiency of Infrastructure Components

In this week's LinuxONE announcement, containers are mentioned a couple of times. Docker served as the underlying infrastructure for some of the announcement's demo workloads, and when implementing such a new project, microservices and containers are a natural choice of deployment paradigm. Further sound bites are about scaling the number of Docker containers on LinuxONE, so what is behind that? For those who know me personally: you've got to watch the video. They made me wear a *black* T-shirt, can you imagine!

A first experiment ran one million containers on a single Emperor/z13 box. Doing the maths, a million containers scheduled on 141 CPUs leaves on average about 0.014% of a CPU per container, which is not a lot of cycles, so why bother? Well, it all started when one of the techies thought it would be a cool demo, and other groups out there were shooting for big numbers (thousands, at the time), too. Reaching a million took some setup work, but nothing special was required to get there.
The technical details show it simply leverages the scale and scalability of the z platform: 141 CPUs, 10 TB RAM, and 10 LPARs with 100 guests each, every guest running 1,200 containers. The experiment used a SLES 12 Linux image with our Docker 1.8.2 binary from developerWorks and --net=host as the Docker option; the workload was a busybox container image running a web server. Startup time for all the containers was under an hour, and that was still with a gccgo-based Docker binary -- golang seems to make a difference there and speed up startup time significantly, so there is room to get much faster again. And one million did not really appear to be the end of the possibilities...
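To give a flavour of such a run, here is a minimal sketch in Python that starts a batch of busybox web servers with host networking. The container count, base port, and the idea of giving every server its own port are my assumptions for illustration, not the exact values and mechanics of the demo setup:

```python
# Hypothetical sketch: start N busybox web-server containers with host
# networking, loosely modelled on one guest of the million-container run.
import subprocess

N = 1200              # containers per guest in the experiment
BASE_PORT = 8000      # assumed; with --net=host every server needs its own port

for i in range(N):
    subprocess.run(
        ["docker", "run", "-d", "--net=host", "busybox",
         "httpd", "-f", "-p", str(BASE_PORT + i)],
        check=True,
    )
```

With --net=host, Docker skips the per-container network plumbing entirely, which helps keep container starts cheap at this rate.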

What can we learn from this experiment? You can go far with Docker on a mainframe, further than you will likely ever need: if all of the containers want to run constantly, even the maximum of 141 CPUs will certainly not be sufficient, so your workload will typically be the limit. However, if you think of individual user sessions handled by individual containers, the load pattern changes: a user might think/read/write for a while before triggering the next action in the server-side logic, so all of a sudden a very large number of containers running at the same time becomes a realistic scenario.
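A quick back-of-the-envelope calculation makes the point; the think and service times below are purely illustrative assumptions, not measurements:

```python
# Back-of-envelope sketch for the "mostly idle sessions" argument.
# All numbers are illustrative assumptions, not data from the demo.
think_time_s = 30.0     # assumed time a user reads/types between requests
service_time_s = 0.05   # assumed CPU time one request needs on the server

# Fraction of containers that are actually busy at any instant:
active_fraction = service_time_s / (think_time_s + service_time_s)

containers = 1000000
cpus = 141
busy = containers * active_fraction
print(f"~{busy:,.0f} busy containers at any instant, "
      f"~{busy / cpus:.1f} per CPU")   # ~1,664 busy, ~11.8 per CPU
```

Under those assumptions, even a million session containers translate into only about a dozen busy containers per CPU.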

How far can you go within a single Linux instance? We have reached 10,000 containers when working around some common code limitations, as described in this developerWorks article, which did this first on Power. Workload-wise, a mix of containers was deployed: 4,000 Apache Solr search engine containers with the default networking setup (NAT), receiving constant requests from a load generator (JMeter). Another 6,000 busybox containers (set up as in the one-million case, with net=host) were added, sitting there and waiting, started merely to further clog the system. Technical details: 36 CPUs with SMT active, 755 GB RAM, and a 4.3 kernel/Docker 1.10-dev combination picking up the modifications mentioned above. Solr had been warmed up and was working on a 46 GB document base. The requests on the Solr containers were still served steadily.
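The actual workarounds are in the developerWorks article; as a generic illustration, the sketch below raises a few kernel limits that commonly bite first when packing thousands of containers into one Linux instance. These particular knobs and values are assumptions on my part, not a quote from the article:

```python
# Hypothetical sketch (run as root): raise kernel limits that are typical
# suspects at container counts in the thousands. Values are illustrative.
LIMITS = {
    "/proc/sys/kernel/pid_max": "4194304",      # default 32768 PIDs fills up fast
    "/proc/sys/kernel/threads-max": "4194304",
    "/proc/sys/fs/file-max": "10000000",
    "/proc/sys/net/ipv4/neigh/default/gc_thresh3": "65536",  # ARP cache, matters with NAT
}

for path, value in LIMITS.items():
    with open(path, "w") as f:
        f.write(value)
    print(f"{path} -> {value}")
```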
Again, this figure of 10,000 containers per Linux instance is likely more than anyone would use in a real-world setup, but it shows you can push until the laws of physics (or the cycle requirements of your workload) stop you. I would not expect real-world usage to exceed hundreds to a few thousand containers per image, though: at some point the operating system will no longer scale when adding processes, and that is when you scale out over virtual servers. That will also keep startup and thus failure recovery times low.

How does all that compare to other platforms? The 4,000 Solr containers from the setup above were also run on an x86 Haswell system with the same resource and software configuration. The system was running under constrained resource conditions -- memory was under pressure and there was, of course, a lot of CPU load. To simulate a more realistic and diverse usage pattern, half of the 4,000 containers constantly received requests and the other half did not. In that comparison, the z box delivered roughly half the latency, leading to about double the throughput. Similar ratios show up in smaller-scale runs with smaller environments.

Which brings us to performance and Docker in general. Containers use namespaces (a Linux kernel technology for resource scoping) as well as control groups for managing resources. Docker is responsible for starting and managing applications, and we have seen that it scales for that task. However, once the workload has been kicked off and is running, applications run directly on the Linux kernel. That means workload performance is tied much more closely to system performance than to Docker. So all the performance advantages shown by the LinuxONE workload performance proof points apply to container environments, too.
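You can see those kernel mechanics directly: every process, containerized or not, carries its namespace memberships and cgroup assignment in /proc. A small sketch:

```python
# Inspect the namespaces and cgroups of the current process. Inside a Docker
# container the namespace inodes and cgroup paths differ from the host's,
# but it is still the very same kernel running the code.
import os

# Namespaces the process belongs to (mnt, net, pid, uts, ipc, ...):
for ns in sorted(os.listdir("/proc/self/ns")):
    print(ns, os.readlink(f"/proc/self/ns/{ns}"))

# Control groups managing its resources:
with open("/proc/self/cgroup") as f:
    print(f.read())
```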

Kudos to the Toronto ecosystem team for running the 1M case, and to the Tokyo Research Lab team for running the 4k and 10k work.
