Managing the data deluge of high-content screening workflows
Integral to many research and testing facilities are the high-content screening (HCS) technologies that continue to evolve alongside imaging and analysis systems. HCS platforms serve a multitude of purposes, from screening the effects of potential drug compounds on cellular processes to facilitating the study of complex biological pathways.
Imaging experiments can yield valuable insights into a variety of biological parameters, including cell morphology, protein localization and expression, cell proliferation, apoptosis, and cell migration. They are also instrumental in genome-wide RNA interference (RNAi) and CRISPR screens, as well as drug toxicology and safety evaluation.
Microscopy has long been a laboratory mainstay, but over the last decade HCS platforms have advanced significantly in scale, speed, and image resolution. Many laboratories have adopted fully automated workflows to streamline imaging processes and enhance the reproducibility of results. Automated systems also minimize the risk of human error in sample handling, imaging settings, and data analysis.
Although HCS has provided a more comprehensive, efficient, and data-rich approach to drug discovery, automated imaging systems generate hundreds of thousands of images each day, resulting in large amounts of data that need to be stored and analyzed.
Paradoxically, the greatest strength of HCS – the vast amount of high-content data it generates – is also one of its greatest challenges.
Too much data to handle?
Regis Doyonnas, Lab Head of Pfizer’s High Content Screening Facility in Connecticut, describes how the company’s classical high-content screening workflows can produce over 80 million images per year. “We run a broad spectrum of assays, including protein co-localization, cell activation, phagocytosis, and GPCR translocation,” he says. “So, as you can imagine, we generate a lot of images and that’s a lot of data to handle.”
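To put that figure in perspective, the short Python sketch below converts the quoted 80 million images per year into a daily rate and a rough yearly storage footprint. The per-image size of 8 MiB is purely an illustrative assumption, not a number from Pfizer or from this article; actual sizes depend on the camera, bit depth, number of channels, and compression.

```python
# Back-of-the-envelope estimate of HCS image volume.
# IMAGES_PER_YEAR comes from the article; ASSUMED_BYTES_PER_IMAGE is a
# hypothetical value chosen only for illustration.

IMAGES_PER_YEAR = 80_000_000
ASSUMED_BYTES_PER_IMAGE = 8 * 1024**2  # assumption: ~8 MiB per image


def images_per_day(images_per_year: int, days_per_year: int = 365) -> float:
    """Average number of images produced per calendar day."""
    return images_per_year / days_per_year


def yearly_storage_tib(images_per_year: int, bytes_per_image: int) -> float:
    """Raw storage needed for one year of images, in tebibytes (TiB)."""
    return images_per_year * bytes_per_image / 1024**4


if __name__ == "__main__":
    per_day = images_per_day(IMAGES_PER_YEAR)
    tib = yearly_storage_tib(IMAGES_PER_YEAR, ASSUMED_BYTES_PER_IMAGE)
    print(f"{per_day:,.0f} images per day on average")
    print(f"{tib:,.1f} TiB per year under the assumed image size")
```

The daily average works out to roughly 219,000 images, in line with the "hundreds of thousands of images each day" mentioned earlier. The storage figure scales linearly with whatever image size is plugged in, so it should be read as an order-of-magnitude guide only.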
As laboratories and screening groups continue to adopt image-based technologies, the primary bottleneck is the lack of robust data storage systems. These systems need to provide efficient access to the data and ensure that captured data remains readily available for future use. Any shortcomings in data storage and management limit the speed at which valuable data can be collected, slowing both data analysis and research progress.
Rafael Fernandez, Associate Director for Merck’s Research Labs Information Technology, acknowledges that the increased throughput and data output of HCS platforms are a primary challenge for those adopting imaging workflows. “The rapid development of automated microscopy technologies has consistently outpaced IT for over two decades,” he explains. “Now that everyone is doing imaging, we are going to have to play catchup.”
Fernandez elaborates that Merck has a vendor-agnostic image data storage solution that harmonizes the data and allows analysis from diverse instruments. “We also have an agile data storage archive/retrieval solution and a simple user interface (UI) to enable the end-user scientists to rapidly go through the data without having to rely on heavy support from bioinformaticians,” he says.
Turning to the cloud
While imaging data was traditionally stored on-premises, or ‘on-prem’, there is now a shift to cloud-based storage solutions, motivated by advantages such as scalability, accessibility, and data security. Laboratories can seamlessly transfer image data from imaging systems or analysis platforms to cloud-based storage services, such as Amazon Web Services (AWS), enabling large-scale data transfer and the ability to share data with collaborators or external partners.
“Our workflows are tightly integrated, utilizing robots and remote systems to transport plates for automatic analysis during the night,” explains Doyonnas. “It takes about ten minutes to transfer the information from one plate to the cloud, and when they arrive, they are automatically analyzed in Signals Image Artist™. By the time researchers return in the morning, the data has been efficiently transferred, analyzed, and made ready for further processing, all while they’ve slept.”
The ability to address the challenges presented by large volumes of imaging data is crucial for laboratories. Adopting a cloud-based solution offers a more efficient, scalable, and secure approach to data management for HCS. To hear more about how Doyonnas and Fernandez have been handling the growing demand for imaging data at Pfizer and Merck, watch our webinar: ‘Addressing the IT challenges of image data management and analysis for high-content screening.’