Heinzelmännchen to analyse X-ray images:
How legendary gnomes help us understand nature

A research group from Göttingen is using consumer-grade computers to cope with the Big Data produced by Dectris detectors at synchrotron radiation sources.

How does life work? Today, this fundamental question of biology is also addressed by chemists and physicists. Interdisciplinary researchers join forces, each bringing different training, expertise, and experiments. Using X-rays at large-scale research facilities such as the ESRF in Grenoble or the PETRA III synchrotron in Hamburg, researchers can look deep into tissue and zoom in on single biological cells. Researchers from the teams of Prof. Tim Salditt and Prof. Sarah Köster at the University of Göttingen in Germany are developing new experimental methods to capture biophysical processes taking place in biological cells. This requires imaging methods that go well beyond what is possible with light microscopes: they are using X-rays.

Sarah Köster explains their goal: “Biological cells are extremely complex systems, combining numerous different components and functions. With diffractive X-ray imaging, we exploit the small wavelength, the ability to directly image electron density contrast, and the high penetration power of X-rays.” Recently, the research teams have taken a closer look at the myelin structure in nerve fibres and at cytoskeletal networks in eukaryotic cells, and they are studying the packing of DNA in bacterial nucleoids.

A close-up of one Heinzelmännchen 19" rack drawer, part of a dedicated cluster for X-ray scanning nano-SAXS data analysis. Using modern EigerX 4M detectors by Dectris, hundreds of gigabytes of data can be collected at synchrotron radiation sources within a few minutes; parallel analysis of such large datasets depends on fast hardware and flexible software, such as that currently being developed by Markus Osterhoff at the Institute for X-Ray Physics, University of Göttingen.

“Using scanning nano-diffraction, we bridge two worlds of X-ray imaging: A highly focused beam is used to resolve the different constituents inside biological cells. From the diffraction patterns we get at each position, we can extract structural information on molecular length scales,” says Markus Osterhoff from Göttingen, explaining the experimental method. The sample is raster-scanned and studied at millions of different positions; at each point in space, the EigerX detector by Dectris produces an “image” with four million pixels. This can now be done at a frame rate of 750 Hz, leading to data rates of more than six gigabytes per second.
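The quoted data rate can be checked with a quick back-of-the-envelope calculation. The sketch below uses the rounded figures from the text (four million pixels, 750 frames per second); the pixel depth of two bytes is an assumption made here for illustration and is not stated in the article.

```python
# Rough estimate of the uncompressed detector data rate.
PIXELS_PER_FRAME = 4_000_000   # "four million pixels" (rounded figure from the text)
FRAMES_PER_SECOND = 750        # frame rate quoted in the text
BYTES_PER_PIXEL = 2            # ASSUMPTION: 16-bit pixel values, not given in the text


def raw_data_rate(pixels=PIXELS_PER_FRAME,
                  fps=FRAMES_PER_SECOND,
                  bytes_per_pixel=BYTES_PER_PIXEL):
    """Return the uncompressed data rate in bytes per second."""
    return pixels * fps * bytes_per_pixel


rate = raw_data_rate()
print(f"{rate / 1e9:.1f} GB/s uncompressed")
```

With these rounded inputs the estimate comes out at about six gigabytes per second, in line with the order of magnitude quoted above.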

Luckily for the researchers and their IT infrastructure, this tremendous data stream is compressed in real time. Nevertheless, one Blu-ray disc could be filled every five to ten minutes. “Recently, we collected one terabyte of data per day, for one week of beamtime at the ESRF synchrotron in Grenoble,” says Osterhoff, stating the problem: within a couple of hours, the number of data points reaches a 1 followed by 13 zeros.

Each of the millions of detector images holds more than four million values; but in the end, every frame is reduced to only one number. Which number? The group at the University of Göttingen does not yet know. They are establishing a new quantitative imaging method and are still developing the physical model and the mathematical formulas behind it. This means that they have to re-process the data sets over and over again, changing the parameters of the algorithm and comparing the results.
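The workflow described here, reducing each frame to a single number and repeating the reduction with different parameters, can be sketched as follows. The reduction used (summing all pixel counts at or above a threshold) and the names `reduce_frame`, `reduce_scan` and `threshold` are purely hypothetical stand-ins; the group's actual physical model is, as stated, still under development.

```python
def reduce_frame(frame, threshold):
    """Reduce one detector frame (a list of pixel rows) to a single number.

    HYPOTHETICAL reduction: sum of all pixel counts at or above `threshold`.
    """
    return sum(value for row in frame for value in row if value >= threshold)


def reduce_scan(frames, threshold):
    """Apply the reduction to every frame of a scan, one number per position."""
    return [reduce_frame(frame, threshold) for frame in frames]


# Two tiny made-up 2x2 "frames" standing in for four-megapixel images.
frames = [
    [[0, 1], [5, 2]],
    [[3, 0], [0, 7]],
]

# Re-process the same data with different parameters and compare the results.
for threshold in (1, 3):
    print(threshold, reduce_scan(frames, threshold))
```

Because every frame is reduced independently of all others, the work can be split across many machines without any communication between them.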

Network connections of the 24-node Heinzelmännchen cluster, used as a dedicated cluster for X-ray scanning nano-SAXS data analysis.

The group in Göttingen started by analysing the data on specialised hardware that they otherwise use for X-ray tomography; this three-dimensional technique needs to handle all images at the same time and requires large amounts of memory. For the scanning SAXS method, on the other hand, it turned out that the actual calculation is very fast, but the amount of data becomes too large for a single many-core parallel computer. Instead, the team acquired 24 consumer-grade computers, which now work in parallel. To cut costs and save space, the computers consist only of essential hardware: mainboard, processor and memory; four nodes share one power supply. “A control PC distributes the work and data streams to the computing nodes; now we have 24 ‘Heinzelmännchen’ working on our EigerX datasets, and in principle we could cope in real time with four EigerX detectors running in parallel at full speed,” says Markus Osterhoff, proud of the achievement. The team is now combining the scanning technique with X-ray holography and super-resolution optical microscopy to get an even sharper and more detailed look into single cells.
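The idea of a control PC handing out work to 24 nodes can be illustrated with a minimal sketch. It is not the group's software: local threads stand in for the networked computers, the round-robin split is an assumed scheduling strategy, and `analyse_chunk` is a hypothetical placeholder for the real per-frame analysis.

```python
from concurrent.futures import ThreadPoolExecutor

N_NODES = 24  # number of computing nodes mentioned in the text


def split_round_robin(items, n_workers):
    """Deal the items out to the workers like cards: 0, 1, ..., n-1, 0, 1, ..."""
    return [items[i::n_workers] for i in range(n_workers)]


def analyse_chunk(chunk):
    """HYPOTHETICAL per-node job: reduce each frame to one number (its sum)."""
    return [sum(frame) for frame in chunk]


def run_cluster(frames, n_workers=N_NODES):
    """Distribute frames to the workers and reassemble results in scan order."""
    chunks = split_round_robin(frames, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(analyse_chunk, chunks))
    # Undo the round-robin split: worker i handled frames i, i+n, i+2n, ...
    results = [None] * len(frames)
    for i, part in enumerate(partial_results):
        results[i::n_workers] = part
    return results


# Fifty tiny made-up "frames"; real frames would hold millions of pixels.
frames = [[i, i] for i in range(50)]
print(run_cluster(frames)[:5])
```

The split works only because each frame can be analysed on its own; a tomographic reconstruction, which needs all images at once, would not divide up this simply.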