Ai-ready (86)#

2018 Data Science Bowl#

Booz Allen Hamilton

Published 2018-01-16

Licensed RESTRICTIVE LICENSE

This dataset contains a large number of segmented nuclei images. The images were acquired under a variety of conditions and vary in cell type, magnification, and imaging modality (brightfield vs. fluorescence). The dataset is designed to challenge an algorithm's ability to generalize across these variations.

Tags: Nuclei Images, Restrictive License, Ai-Ready, Exclude From Dalia

Content type: Data

https://www.kaggle.com/c/data-science-bowl-2018/data

3D Ground Truth Annotations of Nuclei in 3D Microscopy Volumes#

Alain Chen, Liming Wu, Seth Winfree, Kenneth Dunn, Paul Salama, Edward Delp, Teresa Zulueta-Coarasa

Published 2024-12-20

Licensed CC-BY-4.0

This submission contains a set of 3D microscopy volumes of cell nuclei from different species and tissues that have been manually segmented. We also provide synthetically generated 3D microscopy volumes that can be used for training segmentation methods.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.ebi.ac.uk/bioimage-archive/galleries/ai/analysed-dataset/S-BIAD1518/

3D HL60 Cell line (synthetic data)#

David Svoboda, Michal Kozubek, Stanislav Stejskal

Published 2009-06-01

Licensed CC-BY-3.0

One of the principal challenges in counting or segmenting nuclei is dealing with clustered nuclei. To help assess algorithm performance in this regard, this synthetic image set consists of four subsets with an increasing degree of clustering. Each subset is also provided in two different levels of quality: high SNR and low SNR.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC024

3D cell shape of Drosophila Wing Disc#

Giulia Paci, Ines Fernandez Mosquera, Pablo Vicente Munuera, Yanlan Mao

Published 2023-08-14

Licensed CC0-1.0

Segmentation masks of individual cells in Drosophila wing discs

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.ebi.ac.uk/bioimage-archive/galleries/S-BIAD843-ai.html

3D light-sheet microscopy data for SELMA3D 2024 challenge - Training subset with annotations#

Ying Chen, Johannes C. Paetzold, Ali Erturk, Doris Kaltenecker, Mihail Todorov, Harsharan Singh Bhatia, Shan Zhao, Luciano Höher

Published 2024-06-05

Licensed CC-BY-4.0

This dataset is the training set with annotations for the SELMA3D challenge. The SELMA3D challenge focuses on self-supervised learning for 3D light-sheet microscopy image segmentation. Its objective is to encourage the development of self-supervised learning methods for general segmentation of various structures in 3D light-sheet microscopy images. The dataset contains 3D image patches of different labeled biological structures in the brain, including blood vessels, c-Fos labeled brain cells involved in neural activity, cell nuclei, and Alzheimer's disease plaques. Each patch includes corresponding pixel-wise annotations for the labeled structures.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.ebi.ac.uk/bioimage-archive/galleries/ai/analysed-dataset/S-BIAD1196/

3D nuclei instance segmentation dataset of fluorescence microscopy volumes of C. elegans#

Fuhui Long, Hanchuan Peng, Xiao Liu, Stuart K Kim, Eugene Myers, Dagmar Kainmüller, Martin Weigert

Published 2022-02-01

Licensed CC-BY-4.0

The dataset consists of 28 confocal microscopy volumes of C. elegans worms at the L1 stage and corresponding stacks of densely annotated nuclei instance segmentation masks.

28 raw images and corresponding masks of average dimension (xyz) 1050 x 140 x 140
Pixel size (xyz): 0.116 x 0.116 x 0.122 μm
Microscope: Leica confocal microscopy, 63x oil objective
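
As a rough orientation, the average physical extent of a volume follows from multiplying the voxel counts by the voxel size. The sketch below does that arithmetic; the variable names are illustrative, and the numbers are the averages quoted above, so individual volumes will differ.

```python
# Average volume shape (voxels) and voxel size (micrometres), both in x, y, z order,
# copied from the dataset description above.
shape_xyz = (1050, 140, 140)
voxel_size_um = (0.116, 0.116, 0.122)

# Physical extent per axis = number of voxels * voxel size.
extent_um = tuple(n * s for n, s in zip(shape_xyz, voxel_size_um))

for axis, value in zip("xyz", extent_um):
    print(f"{axis}: {value:.2f} um")
```

On average, a volume therefore spans roughly 122 x 16 x 17 μm.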

The original raw data and preliminary annotations were part of the following publication (please cite if you use the dataset): Long, F., Peng, H., Liu, X., Kim, S. K., & Myers, E. (2009). A 3D digital atlas of C. elegans and its application to single-cell analyses. Nature methods, 6(9), 667-672.

The nuclei annotation masks were further manually curated by Dagmar Kainmueller (MDC Berlin) for the following publication:

Hirsch, P., & Kainmueller, D. (2020). An auxiliary task for learning nuclei segmentation in 3d microscopy images. In Medical Imaging with Deep Learning (pp. 304-321). PMLR.

We provide the dataset already structured into the train/validation/test split as used by the above as well as the following publications:

Weigert, M., Schmidt, U., Haase, R., Sugawara, K., & Myers, G. (2020). Star-convex polyhedra for 3d object detection and segmentation in microscopy. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 3666-3673).

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/5942575

https://doi.org/10.5281/zenodo.5942575

A deep learning approach to quantify auditory hair cells#

Maurizio Cortada, Loïc Sauteur, Michael Lanz, Soledad Levano, Daniel Bodmer

Published 2021-03-09

Licensed CC-BY-4.0

StarDist 2D deep learning model and training dataset.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/4590066

https://doi.org/10.5281/zenodo.4590066

An annotated fluorescence image dataset for training nuclear segmentation methods#

Sabine Taschner-Mandl, Inge M. Ambros, Peter F. Ambros, Klaus Beiske, Allan Hanbury, Wolfgang Doerr, Tamara Weiss, Maria Berneder, Magdalena Ambros, Eva Bozsaky, Florian Kromp, Teresa Zulueta-Coarasa

Published 2023-03-07

Licensed CC0-1.0

Ground-truth annotated fluorescence image dataset for training nuclear segmentation methods

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.ebi.ac.uk/bioimage-archive/galleries/S-BIAD634-ai.html

An annotated high-content fluorescence microscopy dataset with Hoechst 33342-stained nuclei and manually labelled outlines#

Malou Arvidsson, Salma Kazemi Rashed, Sonja Aits

Published 2022-06-17

Licensed CC-BY-4.0

Here we present a benchmarking dataset of fluorescence microscopy images with Hoechst 33342-stained nuclei together with annotations of nuclei, nuclear fragments and micronuclei. Images were randomly selected from an RNA interference screen with a modified U2OS osteosarcoma cell line, acquired on a Thermo Fisher CX7 high-content imaging system at 20x magnification. Labelling was performed by a single annotator and reviewed by a biomedical expert.

The dataset contains 50 images showing over 2000 labelled nuclear objects in total, which is sufficiently large to train well-performing neural networks for instance or semantic segmentation. It is pre-split into training, development and test sets, each in a zip file. The dataset should be referred to as Aitslab_bioimaging1. A brief article describing the dataset is also available (Arvidsson M, Kazemi Rashed S, Aits S. 10.1016/j.dib.2022.108769).

Dataset description:

Fluorescence microscopy images: original .C01 files and files converted to 8-bit .png format (Grayscale)

Annotations: 24-bit .png format (RGB)

Script used to convert C01 to png images: C01_to_png.py file with python code and readme.md file with instructions to run it

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/6657260

https://doi.org/10.5281/zenodo.6657260

An image-based data-driven analysis of cellular architecture in a developing tissue#

Jonas Hartmann, Mie Wong, Elisa Gallo, Darren Gilmour

Published 2022-12-13

Licensed CC-BY-4.0

3D zebrafish embryo images with single-cell segmentation and point cloud-based morphometry

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.ebi.ac.uk/bioimage-archive/galleries/S-BIAD599-ai.html

Assessment of Residual Breast Cancer Cellularity after Neoadjuvant Chemotherapy using Digital Pathology#

Mohammad Peikari, Sherine Salama, Sharon Nofech-Mozes, Anne L. Martel

Published 2017-10-04

Licensed CC-BY-3.0

Breast cancer (BC) is the second most commonly diagnosed cancer in the U.S., with more than 250,000 new cases of invasive breast cancer reported in 2017. The majority of women with locally advanced and a subset of patients with operable breast cancer will undergo systemic therapy prior to their surgery (neoadjuvant therapy, NAT) to reduce the size of tumor(s) and possibly further undergo breast conserving surgery. The Post-NAT-BRCA dataset is a collection of representative sections from breast resections in patients with residual invasive BC following NAT. Histologic sections were prepared and digitized to produce high resolution, microscopic images of treated BC tumors. Also included are clinical features and expert pathology annotations of tumor cellularity and cell types. The Residual Cancer Burden Index (RCBi) is a clinically validated tool for assessment of response to NAT associated with prognosis. Tumor cellularity is one of the parameters used for calculating the RCBi. In this dataset, tumor cellularity refers to a measure of residual disease after NAT, expressed as the proportion of malignant tumor inside the tumor bed region, which is also annotated. (See MD Anderson RCB Calculator for a detailed description of tumor cellularity.) Malignant, healthy, lymphocyte and other labels were also provided for individual cells to aid development of cell segmentation algorithms.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.cancerimagingarchive.net/collection/post-nat-brca/

Automatic labelling of HeLa “Kyoto” cells using Deep Learning tools#

Romain Guiet

Published 2022-02-25

Licensed CC-BY-4.0

Name: Automatic labelling of HeLa “Kyoto” cells using Deep Learning tools

Data type: Microscopy images from the dataset “HeLa “Kyoto” cells under the scope”, Brightfield (BF), Digital Phase Contrast (DPC, either “raw” or “square-rooted”), Tubulin and H2B fluorescent channel, paired with their corresponding nuclei or cell/cyto label images.

Label images: Label images were generated using the script “prepare_trainingDataset_cellpose.ijm”.

Briefly, for 5 defined time-points (1, 10, 50, 100, 150), channels of interest were duplicated, resaved, and:

- nuclei label images were obtained using StarDist on H2B channel

- cell label images were obtained using Cellpose on Tubulin and H2B channels

A quick visual inspection of the resulting label images concluded that they were satisfactory, although certainly not perfect.

Notes :

- This labelling strategy:

o will not produce 100% accurate labels, but they might be more reproducible than labels generated by humans and are (definitely) much faster to obtain.

o is NOT a recommended way of generating label images; it is intended for educational purposes only.

- The fluorescent channels are part of the dataset to ease the process of review of the labels and are NOT used for training. We generated the labels from the fluorescent channels to later predict labels from the BF or DPC channels only. As such, the fluorescent channels should not be “reused” with our labels during training.

File format: .tif (16-bit)

Image size: 540x540 (Pixel size: 0.299 nm)

NOTE: This dataset uses the “HeLa “Kyoto” cells under the scope” dataset (https://doi.org/10.5281/zenodo.6139958) to automatically generate annotations

NOTE: This dataset was used to train cellpose models in the following Zenodo entry https://doi.org/10.5281/zenodo.6140111

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/6140064

https://doi.org/10.5281/zenodo.6140064

BCCD Dataset#

Shenggan Gan, Nicolas Chen

Published 2017-12-07

Licensed MIT

BCCD Dataset is a small-scale dataset for blood cell detection.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

Shenggan/BCCD_Dataset

Breast Cancer Nuclei images for DL Training + ZeroCostDL4Mic StarDist Model#

Ofra Golani, Vishnu Mohan, Tamar Geiger

Published 2024-05-21

Licensed CC-BY-4.0

Training dataset: Paired microscopy images (fluorescence) and corresponding masks

Microscopy data type: Fluorescence microscopy and masks obtained via manual correction of automatic segmentation with a pre-trained StarDist model (see qupath/models)

Cells were imaged using a 20x objective with a 1x camera adapter in conjunction with a pco.edge 4.2 4MP camera on a Pannoramic SCAN 150 scanner.

Cell type: FFPE tissue sections were sliced from all cancer-containing paraffin blocks

File format: .tif (8-bit for fluorescence and 16-bit for the masks)

StarDist Model: The StarDist model was generated using the ZeroCostDL4Mic platform (Chamier et al., 2021). This custom StarDist model was trained for 100 epochs using 80 manually annotated paired images (image dimensions: (257, 257)) with a batch size of 2, an augmentation factor of 10 and a mae loss function. The StarDist “Versatile fluorescent nuclei” model was used as a training starting point. Key python packages used include TensorFlow (v 2.2.0), Keras (v 1.1.2), CSBdeep (v 0.7.2), NumPy (v 1.21.6), Cuda (v 11..1.105). The training was accelerated using a Tesla P100 GPU. The model weights can be used in the ZeroCostDL4Mic StarDist 2D notebook or in the StarDist Fiji plugin. A QuPath-compatible model is also provided.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/11235393

https://doi.org/10.5281/zenodo.11235393

Breast Cancer Semantic Segmentation (BCSS) dataset#

Mohamed Amgad, Habiba Elfandy, Hagar Hussein, Lamees A Atteya, Mai A T Elsebaie, Lamia S Abo Elnasr, Rokia A Sakr, Hazem S E Salem, Ahmed F Ismail, Anas M Saad, Joumana Ahmed, Maha A T Elsebaie, Mustafijur Rahman, Inas A Ruhban, Nada M Elgazar, Yahya Alagha, Mohamed H Osman, Ahmed M Alhusseiny, Mariam M Khalaf, Abo-Alela F Younes, Ali Abdulkarim, Duaa M Younes, Ahmed M Gadallah, Ahmad M Elkashash, Salma Y Fala, Basma M Zaki, Jonathan Beezley, Deepak R Chittajallu, David Manthey, David A Gutman, Lee A D Cooper

Published 2019-11-09

Licensed CC0-1.0

This repo contains the information and instructions needed to download the dataset associated with the paper: Amgad M, Elfandy H, …, Gutman DA, Cooper LAD. Structured crowdsourcing enables convolutional segmentation of histology images. Bioinformatics. 2019. doi: 10.1093/bioinformatics/btz083. This data can be visualized in a public instance of the Digital Slide Archive at this link. If you click the “eye” image icon in the Annotations panel on the right side of the screen, you will see the results of a collaborative annotation.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

PathologyDataScience/BCSS

CellBinDB: A Large-Scale Multimodal Annotated Dataset#

Can Shi, Jinghong Fan, Zhonghan Deng, Huanlin Liu, Qiang Kang, Yumei Li, Jing Guo, Jingwen Wang, Jinjiang Gong, Sha Liao, Ao Chen, Ying Zhang, Mei Li

Published 2024-11-20

Licensed CC-ZERO

CellBinDB is a large-scale, multimodal annotated dataset for cell segmentation. It contains 1,044 annotated microscope images and 109,083 cell annotations, covering four staining types: DAPI, ssDNA, H&E, and mIF. CellBinDB contains samples from two species, human and mouse, covering more than 30 histologically different tissue types, including disease-related tissues. The images in CellBinDB come from two sources: 844 mouse images from internal experiments and 200 human images from the open access platform 10x Genomics. We annotated all images in CellBinDB and provide two types of image annotations: semantic and instance masks. A xlsx file is attached to record the detailed information of each image. In addition, we provide the images and annotations of nine other widely used publicly available cell segmentation datasets downloaded from their original sources, retaining their original formats for ease of use. The file ‘mixed_licenses.txt’ contains the original accessions of the public datasets used in our project and their associated licenses. Please refer to these links for more information about each dataset and its licensing terms, and use it according to the specifications.
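
The two annotation types are closely related: a semantic (foreground/background) mask can be derived from an instance mask, but not the other way round. The following is a minimal sketch of that relationship, assuming an instance mask is stored as an integer label image in which 0 is background and each cell has its own positive label. That is a common convention, but the description above does not confirm it for CellBinDB, and the function name `instance_to_semantic` is hypothetical.

```python
import numpy as np

def instance_to_semantic(instance_mask):
    """Collapse an integer label image into a binary foreground mask.

    Assumes 0 marks background and each cell has its own positive label.
    """
    return (np.asarray(instance_mask) > 0).astype(np.uint8)

# Toy 4x4 label image with two cells (labels 1 and 2).
instance = np.array([
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 2],
], dtype=np.uint16)

semantic = instance_to_semantic(instance)

# Number of annotated cells = number of distinct non-zero labels.
n_cells = len(np.unique(instance[instance > 0]))
```

In this toy example `semantic` marks seven foreground pixels and `n_cells` is 2; the per-cell identity is only available from the instance mask.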

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/15370205

https://doi.org/10.5281/zenodo.15370205

Cellpose training data and scripts from “Machine learning for histological annotation and quantification of cortical layers”#

Jean Jacquemier, Julie Meystre, Olivier Burri

Published 2024-07-04

Licensed CC-BY-4.0

This workflow contains all the material necessary to reproduce the cell detection performed with QuPath in the paper “Machine learning for histological annotation and quantification of cortical layers”. Inside this workflow and dataset, you will find the following folders:

QuPath Training Project: A QuPath 0.5.0 project containing all the manual annotations (ground truths) used to train the cellpose model, as well as the script to start the training
Training Images and Demo Images: The raw whole slide scanner images needed by the above QuPath project
Model: The folder containing the trained cellpose model
cellpose-training Folder: The exported raw and ground truth images that the above cellpose model was trained on
Scripts: The QuPath scripts, also located in their respective QuPath projects, that were created for this whole workflow
QC: A Jupyter notebook, based on ZeroCostDL4Mic, that computes quality metrics in order to assess the performance of the trained cellpose model. The folder also contains the resulting metrics.

Installation and Use

If you are going to use the QuPath projects, you need a local QuPath installation (https://qupath.github.io/) that is configured to run the QuPath Cellpose Extension (BIOP/qupath-extension-cellpose), as well as a working Cellpose installation (MouseLand/cellpose). Instructions for installation are available from the links above. After that, you should be able to open the QuPath project, navigate to the “Automate > Project scripts” menu and locate the script you wish to run.

train a cell segmentation algorithm in the context of the rat brain Layer Boundaries project
trigger cell segmentation from a QuPath project in a semi-automated pipeline

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/12656468

https://doi.org/10.5281/zenodo.12656468

Chinese Hamster Ovary Cells#

Krisztian Koos, József Molnár, Lóránd Kelemen, Gábor Tamás, Peter Horvath

Published 2016-07-29

Licensed CC-BY-3.0

The image set consists of 60 Differential Interference Contrast (DIC) images of Chinese Hamster Ovary (CHO) cells. The images were taken on an Olympus Cell-R microscope with a 20x lens at the time when the cells initiated their attachment to the bottom of the dish.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC030

Combining StarDist and TrackMate example 1 - Breast cancer cell dataset#

Guillaume Jacquemet

Published 2020-09-17

Licensed CC-BY-4.0

Description: Contains a StarDist example training dataset, a test dataset, and the StarDist model generated using ZeroCostDL4Mic (see HenriquesLab/ZeroCostDL4Mic)

Training dataset: 72 Paired microscopy images (fluorescence) and corresponding masks

Microscopy data type: Fluorescence microscopy (SiR-DNA) and masks obtained via manual segmentation (see HenriquesLab/ZeroCostDL4Mic for details about the segmentation)

Microscope: Spinning disk confocal microscope with a 20x 0.8 NA objective

Cell type: DCIS.COM Lifeact-RFP cells

File format: .tif (16-bit for fluorescence and 8 and 16-bit for the masks)

Image size: 1024x1024 (Pixel size: 634 nm)

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/4034976

https://doi.org/10.5281/zenodo.4034976

Combining StarDist and TrackMate example 2 - T cell dataset#

Nathan H. Roy, Guillaume Jacquemet

Published 2020-09-17

Licensed CC-BY-4.0

Description: Contains a StarDist example training dataset, a test dataset, and the StarDist model generated using ZeroCostDL4Mic (see HenriquesLab/ZeroCostDL4Mic)

Training dataset: 209 Paired microscopy images (brightfield) and corresponding masks

Microscopy data type: brightfield microscopy and masks obtained via manual segmentation (see HenriquesLab/ZeroCostDL4Mic for details about the segmentation)

Microscope: Imaging was done using a 10x phase contrast objective at 37°C on a Zeiss Axiovert 200M microscope equipped with an automated X-Y stage and a Roper EMCCD camera. Time-lapse images were collected every 30 sec for 10 min using SlideBook 6 software (Intelligent Imaging Innovations).

File format: .tif (16-bit for brightfield images and 8 and 16-bit for the masks)

Image size: 1024x1024 (Pixel size: 645 nm)

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/4034929

https://doi.org/10.5281/zenodo.4034929

Combining StarDist and TrackMate example 3 - Flow chamber dataset#

Gautier Follain, Guillaume Jacquemet

Published 2020-09-17

Licensed CC-BY-4.0

Description: Contains a StarDist example training dataset, a test dataset, and the StarDist model generated using ZeroCostDL4Mic (see HenriquesLab/ZeroCostDL4Mic)

Training dataset: Paired microscopy images (brightfield) and corresponding masks

Microscopy data type: brightfield microscopy and masks obtained via manual segmentation (see HenriquesLab/ZeroCostDL4Mic for details about the segmentation)

Microscope: Images were acquired with a brightfield microscope (Zeiss Laser-TIRF 3 Imaging System, Carl Zeiss) and a 10X objective.

File format: .tif (8-bit for brightfield images and 8 and 16-bit for the masks)

Image size: 1024x1024 (Pixel size: 650 nm)

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/4034939

https://doi.org/10.5281/zenodo.4034939

CryoNuSeg#

Amirreza Mahbod, Benjamin Bancher, Isabella Ellinger, Deyun Zhang, Syed Nauyan Rashid

Published 2019-12-31

Licensed CC-BY-NC-SA-4.0

A Dataset for Nuclei Segmentation of Cryosectioned H&E-Stained Histologic Images

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.kaggle.com/datasets/ipateam/segmentation-of-nuclei-in-cryosectioned-he-images

Deep learning segmentation projects of FIB-SEM dataset of U2-OS cell#

Ilya Belevich, Eija Jokitalo

Published 2023-10-26

Licensed CC-BY-4.0

This submission includes ground truth datasets that were used to segment the nuclear envelope (NE), mitochondria, endoplasmic reticulum (ER) and Golgi from a human bone osteosarcoma epithelial cell (U2-OS) imaged using focused ion beam scanning electron microscopy (FIB-SEM). The full FIB-SEM dataset is deposited to EMPIAR (https://www.ebi.ac.uk/empiar, EMPIAR-11746).

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10043461

https://doi.org/10.5281/zenodo.10043461

Deep learning training data (JOVE)#

Jessica Heebner, Carson Purnell, Ryan Hylton, Mike Marsh, Michael Grillo, Matt Swulius

Published 2022-11-18

Licensed CC-ZERO

Cryo-electron tomography (cryo-ET) allows researchers to image cells in their native, hydrated state at the highest resolution currently possible. However, the technique has several limitations that make analyzing the data it generates time-intensive and difficult. Hand-segmenting a single tomogram can take hours to days of human effort, but the microscope can easily generate 50 or more tomograms a day. Current deep learning segmentation programs for cryo-ET do exist but are limited to segmenting one structure at a time. Here multi-slice U-Net convolutional neural networks are trained and applied to automatically segment multiple structures simultaneously within cryo-tomograms. With proper preprocessing, these networks can be robustly inferred to many tomograms without the need for training individual networks for each tomogram. This workflow dramatically improves the speed with which cryo-electron tomograms can be analyzed by cutting segmentation time down to under 30 min in most cases. Further, segmentations can be used to improve the accuracy of filament tracing within a cellular context and to rapidly extract coordinates for subtomogram averaging.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/7335439

https://doi.org/10.5061/dryad.rxwdbrvct

DeepBacs – Bacillus subtilis fluorescence segmentation dataset#

Séamus Holden, Mia Conduit

Published 2021-10-05

Licensed CC-BY-4.0

Training and test images of live B. subtilis cells expressing FtsZ-GFP for the task of segmentation.

Additional information can be found on this github wiki.

The example shows the fluorescence widefield image of live B. subtilis cells expressing FtsZ-GFP and the manually annotated segmentation mask.

Data type: Paired fluorescence and segmented mask images

Microscopy data type: 2D widefield images (fluorescence)

Microscope: Custom-built 100x inverted microscope bearing a 100x TIRF objective (Nikon CFI Apochromat TIRF 100XC Oil); images were captured on a Prime BSI sCMOS camera (Teledyne Photometrics)

Cell type: B. subtilis strain SH130 grown under agarose pads

File format: .tiff (8-bit) or .png (8-bit)

For segmented masks, binary masks are used for training of CARE/U-Net models, 8-bit .tif ROI maps for training of StarDist models and .png images for training of pix2pix models

Image size: 1024 x 1024 px² (Pixel size: 65 nm)

Image preprocessing: Images were denoised using PureDenoise and resulting 32-bit images were converted into 8-bit images after normalizing to 1% and 99.98% percentiles. Images were manually annotated using the Labkit Fiji plugin
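
The percentile-based 32-bit to 8-bit conversion described above can be sketched as follows. This is an illustration with NumPy under stated assumptions, not the authors' actual pipeline (which is not given here): the function name `to_uint8_percentile` is hypothetical, the percentiles are assumed to be computed per image, and the PureDenoise step is not reproduced.

```python
import numpy as np

def to_uint8_percentile(img, low_pct=1.0, high_pct=99.98):
    """Rescale a float image to 8-bit using percentile clipping."""
    img = np.asarray(img, dtype=np.float32)
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: nothing to stretch, return zeros.
        return np.zeros(img.shape, dtype=np.uint8)
    # Map [lo, hi] to [0, 1]; values outside the range are clipped.
    scaled = np.clip((img - lo) / (hi - lo), 0.0, 1.0)
    return np.round(scaled * 255.0).astype(np.uint8)

# Synthetic stand-in for a denoised 32-bit image.
rng = np.random.default_rng(0)
demo = rng.normal(loc=100.0, scale=20.0, size=(64, 64)).astype(np.float32)
out = to_uint8_percentile(demo)
```

Clipping at a high upper percentile keeps a few very bright pixels from compressing the rest of the intensity range into a handful of grey levels.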

Author(s): Mia Conduit1,2, Séamus Holden1,3

Contact email: Seamus.Holden@newcastle.ac.uk

Affiliation:

Centre for Bacterial Cell Biology, Biosciences Institute, Newcastle University, NE2 4AX UK
ORCID: 0000-0002-7169-907X

Associated publications: Whitley et al., 2021, Nature Communications, https://doi.org/10.15252/embj.201696235

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/5550968

https://doi.org/10.5281/zenodo.5550968

DeepBacs – Escherichia coli bright field segmentation dataset#

Christoph Spahn, Mike Heilemann

Published 2021-10-05

Licensed CC-BY-4.0

Training and test images of live E. coli cells imaged under bright field for the task of segmentation.

Additional information can be found on this github wiki.

The example shows a bright field image of live E. coli cells and the manually annotated segmentation mask.

Data type: Paired bright field and segmented mask images

Microscopy data type: 2D bright field images recorded at 1 min interval

Microscope: Nikon Eclipse Ti-E equipped with an Apo TIRF 1.49NA 100x oil immersion objective

Cell type: E. coli MG1655 wild type strain (CGSC #6300).

File format: .tif (8-bit)

Image size: 1024 x 1024 px² (79 nm / pixel), 19/15 individual frames (training/test dataset)

1024 x 1024 px² (79 nm / pixel), 9 regions of interest with 80 frames @ 1 min time interval (live-cell time series)

Image preprocessing: Raw images were recorded in 16-bit mode (image size 512 x 512 px² @ 158 nm/px). Images were upscaled with a factor of 2 (no interpolation) to enable generation of higher-quality segmentation masks. Two sets of mask images are provided: RoiMaps for instance segmentation using e.g. StarDist or binary images for CARE or U-Net.
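
Upscaling by a factor of 2 without interpolation amounts to repeating every pixel in a 2 x 2 block, which halves the nominal pixel size (158 nm to 79 nm) without creating new intensity values. A minimal NumPy sketch, assuming this pixel-repetition reading of "no interpolation"; the function name `upscale_nearest` is hypothetical and the tool the authors used is not stated here.

```python
import numpy as np

def upscale_nearest(img, factor=2):
    """Enlarge a 2D image by repeating each pixel `factor` times per axis."""
    img = np.asarray(img)
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

small = np.array([[1, 2],
                  [3, 4]], dtype=np.uint16)
big = upscale_nearest(small)
# big is:
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]

# A 512 x 512 frame becomes 1024 x 1024, and the pixel size halves.
frame = np.zeros((512, 512), dtype=np.uint16)
upscaled = upscale_nearest(frame)
pixel_size_nm = 158 / 2
```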

Author(s): Christoph Spahn1,2, Mike Heilemann1,3

Contact email: christoph.spahn@mpi-marburg.mpg.de

Affiliation(s):

Institute of Physical and Theoretical Chemistry, Max-von-Laue Str. 7, Goethe-University Frankfurt, 60439 Frankfurt, Germany
ORCID: 0000-0001-9886-2263
ORCID: 0000-0002-9821-3578

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/5550935

https://doi.org/10.5281/zenodo.5550935

DeepBacs – Mixed segmentation dataset and StarDist model#

Christoph Spahn, Mike Heilemann, Séamus Holden, Mia Conduit, Pedro Matos Pereira, Mariana Pinho

Published 2021-10-05

Licensed CC-BY-4.0

Mixed training and test images of S. aureus, E. coli and B. subtilis for cell segmentation using StarDist, as well as the trained StarDist model.

Additional information can be found on this github wiki.

Data type: Paired bright field / fluorescence and segmented mask images

Microscopy data type: 2D widefield images; DIC and fluorescence for S. aureus, bright field images for E. coli, and fluorescence images for B. subtilis

Microscopes:

S. aureus:

GE HealthCare Deltavision OMX system (with temperature and humidity control, 37°C) equipped with an Olympus 60x 1.42NA Oil immersion objective and 2 PCO Edge 5.5 sCMOS cameras (one for DIC, one for fluorescence)

E. coli:

Nikon Eclipse Ti-E equipped with an Apo TIRF 1.49NA 100x oil immersion objective

B. subtilis:

Custom-built 100x inverted microscope bearing a 100x TIRF objective (Nikon CFI Apochromat TIRF 100XC Oil); images were captured on a Prime BSI sCMOS camera (Teledyne Photometrics)

Cell types: S. aureus strain JE2, E. coli MG1655 (CGSC #6300) and B. subtilis strain SH130; all grown under agarose pads

File format: .tif (8-bit and 16-bit)

Image size: 512 x 512 px² @ 80 nm pixel size (S. aureus); 1024 x 1024 px² @ 79 nm pixel size (E. coli); 1024 x 1024 px² @ 65 nm pixel size (B. subtilis)

Image preprocessing:

S. aureus:

Raw images were manually annotated by drawing ellipses in the NR fluorescence image and segmented images were created using the LOCI plugin (“ROI Map”). For training, images and masks were quartered into four 256 x 256 px² patches.

E. coli:

Raw images were recorded in 16-bit mode (image size 512x512 px² @ 158 nm/px). Images were upscaled with a factor of 2 (no interpolation) to enable generation of higher-quality segmentation masks.

B. subtilis:

Images were denoised using PureDenoise and resulting 32-bit images were converted into 8-bit images after normalizing to 1% and 99.98% percentiles. Images were manually annotated using the Labkit Fiji plugin

StarDist model:

The StarDist 2D model was generated using the ZeroCostDL4Mic platform (Chamier et al., 2021). It was trained from scratch for 200 epochs (120 steps/epoch) on 155 paired image patches (image dimensions: (1024, 1024), patch size: (256,256)) with a batch size of 4, 10% validation data, 64 rays on grid 2, a learning rate of 0.0003 and a mae loss function, using the StarDist 2D ZeroCostDL4Mic notebook (v 1.12.2). Key python packages used include tensorflow (v 0.1.12), Keras (v 2.3.1), csbdeep (v 0.6.1), numpy (v 1.19.5), cuda (v 11.0.221). The training was accelerated using a Tesla P100GPU. The dataset was augmented by a factor of 3.

The model weights can be used in the ZeroCostDL4Mic StarDist 2D notebook, the StarDist Fiji plugin or the TrackMate Fiji plugin (v7+).

Author(s): Christoph Spahn1,2, Mike Heilemann1,3, Mia Conduit4, Séamus Holden4,5, Pedro Matos Pereira6,7, Mariana Pinho6,8

Contact email: christoph.spahn@mpi-marburg.mpg.de, Seamus.Holden@newcastle.ac.uk, pmatos@itqb.unl.pt and mgpinho@itqb.unl.pt

Affiliation(s):

Institute of Physical and Theoretical Chemistry, Max-von-Laue Str. 7, Goethe-University Frankfurt, 60439 Frankfurt, Germany
ORCID: 0000-0001-9886-2263
ORCID: 0000-0002-9821-3578
Centre for Bacterial Cell Biology, Biosciences Institute, Newcastle University, NE2 4AX UK
ORCID: 0000-0002-7169-907X
Bacterial Cell Biology, Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
ORCID: 0000-0002-1426-9540
ORCID: 0000-0002-7132-8842

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/5551009

https://doi.org/10.5281/zenodo.5551009

DeepBacs – Staphylococcus aureus widefield segmentation dataset#

Pedro Matos Pereira, Mariana Pinho

Published 2021-10-05

Licensed CC-BY-4.0

Training and test images of live S. aureus cells for the task of cell segmentation.

Additional information can be found in the github wiki.

The example shows the bright field and Nile Red fluorescence image of live S. aureus cells, as well as the manually annotated segmentation mask.

Data type: Paired DIC/fluorescence and segmented mask images

Microscopy data type: 2D widefield images (DIC and fluorescence)

Microscope: GE HealthCare Deltavision OMX system (with temperature and humidity control, 37°C) equipped with an Olympus 60x 1.42NA Oil immersion objective and 2 PCO Edge 5.5 sCMOS cameras (one for DIC, one for fluorescence)

Cell type: S. aureus strain JE2 grown under agarose pads

File format: .tif (16-bit)

Image size: 512 x 512 px² (80 nm/px)

Image preprocessing: Raw images were manually annotated by drawing ellipses in the NR fluorescence image and segmented images were created using the LOCI plugin (“ROI Map”). For training, images and masks were quartered into four 256 x 256 px² patches.
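The quartering step can be pictured with a minimal sketch. It is not the authors' preprocessing code: the function names are hypothetical, and the image is assumed to be a plain list of pixel rows so that no imaging library is needed.

```python
def quarter_boxes(height, width):
    """Return (row_start, row_stop, col_start, col_stop) for the four quadrants."""
    half_h, half_w = height // 2, width // 2
    return [
        (r, r + half_h, c, c + half_w)
        for r in (0, half_h)
        for c in (0, half_w)
    ]


def quarter(image):
    """Split a 2D image (list of rows) into four equally sized patches."""
    boxes = quarter_boxes(len(image), len(image[0]))
    return [[row[c0:c1] for row in image[r0:r1]] for r0, r1, c0, c1 in boxes]


# A 512 x 512 image yields four 256 x 256 patches.
boxes = quarter_boxes(512, 512)
patches = quarter([[0] * 512 for _ in range(512)])
```

Applying the same boxes to an image and its mask keeps the two aligned patch by patch.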

Author(s): Pedro Matos Pereira (1,2), Mariana Pinho (1,3)

Contact email: pmatos@itqb.unl.pt and mgpinho@itqb.unl.pt

Affiliation:

1. Bacterial Cell Biology, Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
2. ORCID: https://orcid.org/0000-0002-1426-9540
3. ORCID: https://orcid.org/0000-0002-7132-8842

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/5550933

https://doi.org/10.5281/zenodo.5550933

Detection and Segmentation of Cell Nuclei in Virtual Microscopy Images: A Minimum-Model Approach#

Stephan Wienert, Daniel Heim, Kai Saeger, Albrecht Stenzinger, Michael Beil, Peter Hufnagl, Manfred Dietel, Carsten Denkert, Frederick Klauschen

Published 2012-07-11

Licensed CC-BY-NC-SA-3.0

A novel contour-based approach to cell detection and segmentation has been developed, which uses minimal prior information and detects contours independently of their shape, avoiding a segmentation bias. This approach has been shown to accurately segment a broad range of normal and disease-related morphological features, with high precision and recall rates.

Tags: Nuclei Images, Ai-Ready, Exclude From Dalia

Content type: Data

https://www.nature.com/articles/srep00503#Sec16

Drosophila Kc167 cells#

Vebjorn Ljosa, Katherine L. Sokolnicki, Anne E. Carpenter

Published 2012-06-28

Licensed CC0-1.0

Drosophila melanogaster Kc167 cells were stained for DNA (to label nuclei) and actin (a cytoskeletal protein, to show the cell body). Automatic cytometry requires that cells be segmented, i.e., that the pixels belonging to each cell be identified. Because segmenting nuclei and distinguishing foreground from background is comparatively easy for these images, the focus here is on finding the boundaries between adjacent cells.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC007

Effect of local topography on cell division of Staphylococci sp.#

Ioritz Sorzabal Bellido, Luca Barbieri, Alison J. Beckett, Ian A. Prior, Arturo Susarrey-Arce, Roald M. Tiggelaar, Jo Forthergill, Rasmita Raval, Yuri A. Diaz Fernandez

Published 2021-05-16

Licensed CC-BY-4.0

Dataset.zip

This dataset includes the raw and annotated images used to train a StarDist 2D deep learning model for segmentation of surface-attached S. aureus, as described in Effect of local topography on cell division of Staphylococci sp.

Stardist2d_Model.zip

StarDist 2D deep learning model for segmentation of surface-attached S. aureus, obtained using the StarDist 2D ZeroCostDL4Mic notebook (v 1.12.3).

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/4765599

https://doi.org/10.5281/zenodo.4765599

Embryonic mice ultrasound volumes with body and brain volume segmentation masks#

Ziming Qiu, Matthew Hartley

Published 2023-05-10

Licensed CC0-1.0

Ultrasound images of mouse embryos with body and brain volume segmentation masks

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.ebi.ac.uk/bioimage-archive/galleries/S-BIAD686-ai.html

Fiber and vessel dataset for segmentation and characterization#

Saqib Qamar, Abu Imran Baba, Stèphane Verger, Magnus Andersson

Published 2024-05-03

Licensed CC-BY-4.0

This repository hosts a comprehensive collection of datasets used to develop an innovative deep learning model designed to enhance the segmentation and characterization of macerated fibers and vessel forms in microscopy images. Included in the deposit are raw images, alongside meticulously prepared training and validation datasets. We present an automated segmentation approach that utilizes the one-stage YOLOv8 model, which has been specifically adapted to process high-resolution microscopy images up to 32640 x 25920 pixels. Our model excels in cell detection and segmentation, demonstrating exceptional proficiency.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10913446

https://doi.org/10.5281/zenodo.10913446

Go-Nuclear. A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context#

Kay Schneitz, Athul Vijayan, Tejasvinee Mody

Published 2024-06-29

Licensed CC0-1.0

We present computational tools that allow versatile and accurate 3D nuclear segmentation in plant organs, enable the analysis of cell-nucleus geometric relationships, and improve the accuracy of 3D cell segmentation. This BioStudies submission includes the Arabidopsis ovule model training dataset used in the study. The training dataset is composed of strong and weak nuclei image channels, the corresponding ground truth segmentation, the cell wall image and the associated cell segmentation mentioned in the study. A total of 47 trained models from the study are made available: 15 initial models, 30 gold models, and 2 platinum models. Models were trained using PlantSeg, StarDist and Cellpose. All image datasets and their segmentations shown in the figures of this study are also available as separate zip files, covering image datasets from different species and organs.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.ebi.ac.uk/bioimage-archive/galleries/ai/analysed-dataset/S-BIAD1026/

Ground-truth cell body segmentation used for Starfinity training#

Yuhan Wang, Martin Weigert, Uwe Schmidt, Stephan Saalfeld, Eugene W. Myers, Tim Wang, Karel Svoboda, Mark Eddison, Greg Fleishman, Shengjin Xu, Fredrick E. Henry, Andrew L. Lemire, Hui Yang, Konrad Rokicki, Cristian Goina, Eugene W Myers, Wyatt Korff, Scott M. Sternson, Paul W. Tillberg

Published 2021-03-05

Licensed CC-BY-4.0

Accurate segmentation of volumetric fluorescence image data has been a long-standing challenge, and poor segmentation can considerably degrade the accuracy of multiplexed fluorescence in situ hybridization (FISH) analysis. To overcome this challenge, we developed a deep learning-based automatic 3D segmentation algorithm called Starfinity. For each pixel, it first predicts the cell center probability and the radial distances to the nearest cell borders. It then aggregates pixel affinity maps from the densely predicted distances and applies a watershed segmentation on the affinity maps, using the thresholded center probability as seeds.
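The final step described above, a watershed seeded by the thresholded center probability, can be illustrated with a toy sketch. This is not the Starfinity implementation: it works in 2D, it floods the center-probability map itself rather than aggregated affinity maps, and the function name and both thresholds are hypothetical.

```python
import heapq


def seeded_watershed(prob, seed_threshold, mask_threshold):
    """Toy seeded watershed on a 2D grid of center probabilities (list of rows)."""
    h, w = len(prob), len(prob[0])
    labels = [[0] * w for _ in range(h)]
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    heap = []
    next_label = 0

    # 1) Seeds: connected components of pixels at or above the seed threshold.
    for y in range(h):
        for x in range(w):
            if prob[y][x] < seed_threshold or labels[y][x]:
                continue
            next_label += 1
            labels[y][x] = next_label
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                heapq.heappush(heap, (-prob[cy][cx], cy, cx))
                for dy, dx in neighbours:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w and not labels[ny][nx]
                            and prob[ny][nx] >= seed_threshold):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))

    # 2) Flooding: grow labels outwards, visiting high-probability pixels first.
    while heap:
        _, cy, cx = heapq.heappop(heap)
        for dy, dx in neighbours:
            ny, nx = cy + dy, cx + dx
            if (0 <= ny < h and 0 <= nx < w and not labels[ny][nx]
                    and prob[ny][nx] >= mask_threshold):
                labels[ny][nx] = labels[cy][cx]
                heapq.heappush(heap, (-prob[ny][nx], ny, nx))
    return labels


# Two bright centers separated by a dim valley become two instances.
result = seeded_watershed([[0.9, 0.6, 0.3, 0.6, 0.9, 0.1]], 0.8, 0.2)
```

Pixels below the mask threshold stay at label 0 and are treated as background.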

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://janelia.figshare.com/articles/dataset/Ground-truth_cell_body_segmentation_used_for_Starfinity_training/13624268

HPA Nucleus Segmentation (DPNUnet)#

Hao Xu, Wei Ouyang

Published 2023-03-02

Licensed CC-BY-4.0

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/7690494

https://doi.org/10.5281/zenodo.7690494

HT1080WT cells embedded in 3D collagen type I matrices - manual annotations for cell instance segmentation and tracking#

Estibaliz Gómez-de-Mariscal, Hasini Jayatilaka, Denis Wirtz, Arrate Muñoz-Barrutia

Published 2021-12-13

Licensed CC-BY-4.0

Human fibrosarcoma HT1080WT (ATCC) cells at low cell densities embedded in 3D collagen type I matrices [1]. The time-lapse videos were recorded every 2 minutes for 16.7 hours and covered a field of view of 1002 pixels × 1004 pixels with a pixel size of 0.802 μm/pixel. The videos were pre-processed to correct frame-to-frame drift artifacts, resulting in a final size of 983 pixels × 985 pixels.

[1] Hasini Jayatilaka, Anjil Giri, Michelle Karl, Ivie Aifuwa, Nicholaus J Trenton, Jude M Phillip, Shyam Khatau, and Denis Wirtz. EB1 and cytoplasmic dynein mediate protrusion dynamics for efficient 3-dimensional cell migration. FASEB J., 32(3):1207–1221, 2018. ISSN 0892-6638. doi: 10.1096/fj.201700444RR.

Further information about how to use this data is given in esgomezm/microscopy-dl-suite-tf

This dataset is provided together with the following preprint and if you use it, we would like to kindly ask you to cite it properly:

Estibaliz Gómez-de-Mariscal, Hasini Jayatilaka, Özgün Çiçek, Thomas Brox, Denis Wirtz, Arrate Muñoz-Barrutia, Search for temporal cell segmentation robustness in phase-contrast microscopy videos, arXiv 2021 (arXiv:2112.08817)

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/5979761

https://doi.org/10.5281/zenodo.5979761

Human HT29 colon-cancer cells#

Vebjorn Ljosa, Katherine L. Sokolnicki, Anne E. Carpenter

Published 2012-06-28

Licensed CC-BY-NC-SA-3.0

These images are of human HT29 colon cancer cells, a cell line that has been widely used for the study of many normal and neoplastic processes. A set of about 43,000 such images was used by Moffat et al. (Cell, 2006) to screen for mitotic regulators. The analysis followed the common pattern of identifying and counting cells with a phenotype of interest (in this case, cells that were in mitosis), then normalizing the count by dividing by the total number of cells. Such experiments present two image analysis problems. First, identifying the cells that have the phenotype of interest requires that the nuclei and cells be segmented. Second, normalizing requires an accurate cell count.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC008

Human Hepatocyte and Murine Fibroblast cells Co-culture experiment#

David J. Logan, Jing Shan, Sangeeta N. Bhatia, Anne E. Carpenter

Published 2016-03-01

Licensed CC-BY-3.0

This 384-well plate has images of co-cultured hepatocytes and fibroblasts. Every other well is populated (A01, A03, …, C01, C03, …) such that 96 wells comprise the data. Each well has 9 sites and thus 9 images associated, totaling 864 images.
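The well and image counts above can be reproduced with a short sketch. It assumes the standard 384-well layout of 16 rows (A to P) by 24 columns, and reads "every other well" as every second row and every second column, which matches the A01, A03, …, C01, C03 pattern. The variable names are illustrative only.

```python
import string

ROWS = string.ascii_uppercase[:16]  # A..P, the 16 rows of a 384-well plate
COLUMNS = range(1, 25)              # columns 1..24

# Assumption: populated wells sit on every second row and every second column.
populated = [f"{row}{col:02d}" for row in ROWS[::2] for col in COLUMNS[::2]]

sites_per_well = 9
total_images = len(populated) * sites_per_well
```

With these assumptions the list starts A01, A03 and contains 96 wells, giving 96 × 9 = 864 images.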

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC026

Human Lung Tissue Microscopy (DIC, Fluorescence, Cell and Nuclei Semantic Instance Annotations)#

Melanie Dohmen, Mirja Mittermaier, Andreas Hocke

Published 2024-02-22

The zip file contains 3 folders (annotations, images and training_splits).

The annotations folder contains 3 folders (cell_instances, nuclei_instances and semantic). Cell and nuclei instance annotations are long int tif images, containing numbered instance ids and 0 in the background. Semantic annotations are 8-bit int png files containing the class ids (0: background, 1: normal tissue, 2: erythrocytes, 3: alveolar epithelial type 2 cells, 4: alveolar macrophages, 5: other nuclei, 6: alveolar epithelial type 2 cell nuclei, 7: alveolar macrophage nuclei, 8: cell debris).

The images folder contains 4 folders (CD68, DAPI, DIC, proSPC), where DIC contains float-valued background-corrected differential interference contrast images and the others contain normalized float-valued fluorescence channels of a multiplex staining with CD-68 (whole alveolar macrophages), DAPI (any cell nuclei) and proSPC (cytoplasm of alveolar epithelial type 2 cells). All images are in tif format.

The training_splits folder contains 3 text files, each listing the image prefixes (the file names of images and annotations without their ending, e.g. without “_DIC.tif”) of all cases in the respective subset. Of the 68 cases in total, 51 are in the train set, 7 in the validation set and 10 in the test set.

The lung tissue originates from lung surgery of patients, but does not include resected tumors. Please see reference [1]. The images were acquired with a laser scanning microscope at 40x magnification and 1024 x 1024 pixels per image.
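The class ids and split sizes listed in this description can be written down as a small lookup, sketched here under assumptions. The helper `case_prefix` is hypothetical; only the “_DIC.tif” ending is given in the description, and the endings for the other three channels are assumed to follow the same pattern.

```python
# Class ids of the semantic annotations, as listed in the description.
SEMANTIC_CLASSES = {
    0: "background",
    1: "normal tissue",
    2: "erythrocytes",
    3: "alveolar epithelial type 2 cells",
    4: "alveolar macrophages",
    5: "other nuclei",
    6: "alveolar epithelial type 2 cell nuclei",
    7: "alveolar macrophage nuclei",
    8: "cell debris",
}

# Number of cases per subset, as listed in the description.
SPLIT_SIZES = {"train": 51, "validation": 7, "test": 10}


def case_prefix(filename):
    """Strip a channel ending such as "_DIC.tif" to recover the case prefix."""
    for channel in ("CD68", "DAPI", "DIC", "proSPC"):  # assumed naming pattern
        suffix = f"_{channel}.tif"
        if filename.endswith(suffix):
            return filename[: -len(suffix)]
    return filename


total_cases = sum(SPLIT_SIZES.values())
```

A prefix recovered this way can be matched against the lines of a split file to decide which subset an image belongs to.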

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10669918

https://doi.org/10.5281/zenodo.10669918

Human U2OS cells (out of focus)#

Vebjorn Ljosa, Katherine L. Sokolnicki, Anne E. Carpenter

Published 2012-06-28

Licensed CC0-1.0

Since robust foreground/background separation and segmentation of cellular objects (i.e., identification of which pixels belong to which objects) strongly depend on image quality, focus artifacts are detrimental to data quality. This image set provides examples of in- and out-of-focus HCS images which can be used for validation of focus metrics.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC006

LMRG Image Analysis Study - FISH datasets#

Kristopher Kubow, Thomas Pengo

Published 2022-05-18

Licensed CC-BY-4.0

Original image files, label (ground truth) files, and PSF files used in the ABRF Light Microscopy Research Group (LMRG) image analysis study. Simulated 3D confocal fluorescence images of sub-diffraction punctate staining (fluorescence in situ hybridization (FISH) in C. elegans).

See ABRFLMRG/image-analysis-study for more details.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/6560910

https://doi.org/10.5281/zenodo.6560910

LMRG Image Analysis Study - nuclei datasets#

Kristopher Kubow, Thomas Pengo

Published 2022-05-18

Licensed CC-BY-4.0

Original image files, label (ground truth) files, and PSF files used in the ABRF Light Microscopy Research Group (LMRG) image analysis study. Simulated 3D widefield fluorescence images of nuclei.

See ABRFLMRG/image-analysis-study for more details.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/6560759

https://doi.org/10.5281/zenodo.6560759

LyNSeC: Lymphoma Nuclear Segmentation and Classification#

Hussein Naji, Reinhard Büttner, Adrian Simon, Marie-Lisa Eich, Philipp Lohneis, Katarzyna Bozek

Published 2023-06-21

Licensed CC-BY-4.0

In recent years, there has been considerable progress in automated segmentation and classification methods for histological whole slide images (WSIs) stained with hematoxylin and eosin (H&E). Current state-of-the-art techniques are based on diverse datasets of H&E-stained WSIs of different types of predominantly solid cancer. However, there is a lack of publicly available annotated datasets of lymphoma, which is why we generated a labeled diffuse large B-cell lymphoma dataset and denoted it LyNSeC (lymphoma nuclear segmentation and classification). LyNSeC comprises three subsets. LyNSeC 1 consists of 379 IHC images of size 512 x 512 pixels at 40x magnification. In the images, we annotated the contour of each cell nucleus and the cell class: marker-positive or marker-negative.

In total, LyNSeC 1 contains 87,316 annotated cell nuclei of four different cases, with 48,171 of them assigned the class negative and 39,145 positive. We included three markers in this dataset showing visually different staining patterns: cluster of differentiation 3 (CD3), Ki67 as a marker of proliferation, and erythroblast transformation-specific (ETS)-related gene (ERG).

LyNSeC 2 and 3 contain H&E-stained images of 70 different patients. LyNSeC 2 consists of 280 images and LyNSeC 3 of 40 images of size 512 x 512 pixels at 40x magnification. 65,479 and 8,452 nuclei were annotated in LyNSeC 2 and 3, respectively. In LyNSeC 3, the nuclei were also assigned a class label (tumor and non-tumor). 3,747 nuclei were identified as tumors and 4,705 as non-tumors.

In the annotation procedure, the contours of the H&E images (LyNSeC 2 and LyNSeC 3) were annotated by two pathologists and by two students (trained by the pathologists). Annotation of the cell classes in LyNSeC 3 was done by the pathologists only. LyNSeC 1 was annotated by the two students who were additionally trained to annotate the contours and to distinguish marker-positive and marker-negative cells. The pathologists inspected and (if necessary) adjusted the LyNSeC 3 annotations.

The files are uploaded in ‘.npy’ format. The files of LyNSeC 1 (x_l1.npy) and LyNSeC 3 (x_l3.npy) contain five channels, respectively: the first three are the RGB channels of the images, channel 4 contains the instance maps, and channel 5 the class type maps (for LyNSeC 1 a pixel value of 1 corresponds to the class negative and 2 to the class positive, whereas in LyNSeC 3 1 corresponds to the class non-tumor and 2 to the class tumor). The files of LyNSeC 2 (x_l2.npy) have 4 channels (without the class type map).
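The channel layout described above can be summarised in a short sketch. The helper name is hypothetical, and it is an assumption that the channels sit on the last axis, so that each pixel is a vector of 4 or 5 values.

```python
# Class ids of the class type map, as given in the description.
CLASS_NAMES = {
    "LyNSeC 1": {1: "negative", 2: "positive"},
    "LyNSeC 3": {1: "non-tumor", 2: "tumor"},
}


def split_pixel(values):
    """Split one pixel's channel vector into (rgb, instance_id, class_id).

    Works for 5-channel pixels (LyNSeC 1 and 3) and for 4-channel pixels
    (LyNSeC 2), where class_id is returned as None.
    """
    rgb = tuple(values[:3])
    instance_id = int(values[3])
    class_id = int(values[4]) if len(values) == 5 else None
    return rgb, instance_id, class_id


rgb, instance_id, class_id = split_pixel([120, 80, 200, 7, 2])
label = CLASS_NAMES["LyNSeC 3"][class_id]
```

The same indexing applies to a whole array by slicing the last axis instead of a single pixel vector.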

Additionally, we also make our HoVer-Net-based pre-trained nuclei segmentation and classification models available (he.tar for H&E images and ihc.tar for IHC images).

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/8065174

https://doi.org/10.5281/zenodo.8065174

MIDOG 2021#

Marc Aubreville, Frauke Wilm

Published 2021-03-16

Licensed UNLICENSED

Mitosis Domain Generalization (MIDOG). Here you can find the code of our own evaluations and a dockerized reference algorithm for mitotic figures to use as a template.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

DeepMicroscopy/MIDOG

Melanoma Histopathology Dataset with Tissue and Nuclei Annotations#

Mark Schuiveling

Published 2025-03-19

Licensed CC-ZERO

Description: This dataset is designed for development of deep learning models for segmentation of nuclei and tissue in melanoma H&E stained histopathology. Existing nuclei segmentation models that are trained on non-melanoma specific datasets have low performance due to the ability of melanocytes to mimic other cell types, whereas existing melanoma specific models utilize older, sub-optimal techniques. Moreover, these models do not provide tissue annotations necessary for determining the localization of tumor-infiltrating lymphocytes, which may hold value for predictive and prognostic tasks. To address this, we created a melanoma specific dataset with nuclei and tissue annotations.

Methodology:

Sample Collection: Regions of interest (ROIs) were sampled from H&E stained slides of 103 primary melanoma specimens and 102 metastatic melanoma specimens, scanned using a Hamamatsu scanner at 40× magnification (0.23 μm per pixel). All slides were obtained from regular diagnostic procedures. From each specimen, a 40× magnified ROI of 1024×1024 pixels was selected for annotation. Additionally, a context ROI of 5120×5120 pixels was sampled to provide information about the broader context for the annotation process. Selection was performed by a trained medical expert (M.S.) and subsequently verified by a dermatopathologist (W.B.). Manual ROI selection ensured the inclusion of diverse tissue and nuclei types.

Annotation Process:

Nuclei segmentation: Nuclei segmentations were generated using Hover-Net pretrained on the PanNuke dataset. Manual annotation adjustments were performed by author M.S. using QuPath, with the following nuclei categories: tumor, stroma, vascular endothelium, histiocyte, melanophage, lymphocyte, plasma cell, neutrophil, apoptotic cell, and epithelium. All annotations were reviewed and corrected, where needed, by a dermatopathologist (W.B.).

Tissue segmentation: Tissue segmentations were created manually using QuPath by M.S., with the following categories: tumor, stroma, epidermis, necrosis, blood vessel, and background. Annotations were reviewed and corrected, where needed, by a dermatopathologist (W.B.).

Quality Control: To assess the reliability of the annotations, intra- and interobserver agreement (by pathologist G.B.) were determined on 12 randomly selected ROIs.

Nuclei segmentation: The intraobserver overall precision was 84.89%, with a recall of 86.45%, and an F1 score of 85.66%. Interobserver overall precision was 80.34%, with a recall of 80.62%, and an F1 score of 80.20%. These results are based on the sum of all true positive, false positive, and false negative counts for the 12 ROIs.

Tissue segmentation: The DICE score was determined on the same 12 randomly selected ROIs. The average intraobserver DICE score was 0.90, and the interobserver DICE score was also 0.90.
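Scores computed from summed counts, as described for the nuclei agreement, are micro-averaged. A minimal sketch follows. The per-ROI counts are made-up placeholders and not values from the dataset, the function names are hypothetical, and masks are represented as sets of pixel coordinates for simplicity.

```python
def micro_scores(per_roi_counts):
    """Precision, recall and F1 from counts summed over all ROIs."""
    tp = sum(c["tp"] for c in per_roi_counts)
    fp = sum(c["fp"] for c in per_roi_counts)
    fn = sum(c["fn"] for c in per_roi_counts)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, f1


def dice(mask_a, mask_b):
    """Dice overlap of two masks given as sets of (row, col) coordinates."""
    overlap = len(mask_a & mask_b)
    return 2 * overlap / (len(mask_a) + len(mask_b))


# Placeholder counts for two ROIs (illustrative only).
counts = [{"tp": 8, "fp": 2, "fn": 1}, {"tp": 9, "fp": 1, "fn": 3}]
precision, recall, f1 = micro_scores(counts)
overlap_score = dice({(0, 0), (0, 1)}, {(0, 1), (1, 1)})
```

Summing counts before dividing weights each nucleus equally, so ROIs with many nuclei influence the result more than sparse ones.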

Version 3: Removed sample “training_set_metastatic_roi_103” due to inconsistencies in the annotation file.

Version 4: Sample “training_set_metastatic_roi_088” was missing one color annotation for a nuclei_apoptosis in the geojson file, which made it incompatible with QuPath. This is fixed in the new version.

Version 5: Added the corrected sample “training_set_metastatic_roi_103” after the deadline of the test phase of the panoptic segmentation of nuclei and tissue in advanced melanoma challenge.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/15050523

https://doi.org/10.5281/zenodo.15050523

MemBrain-seg training data#

Lorenz Lamm

Published 2023-03-16

Licensed CC-BY-4.0

This dataset contains training data for segmenting membranes in cryo-electron tomograms.

More details will follow.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/7739793

https://doi.org/10.5281/zenodo.7739793

MoNuSeg Dataset#

Neeraj Kumar, Ruchika Verma, Sanuj Sharma, Surabhi Bhargava, Abhishek Vahadane, Amit Sethi

Published 2017-07-01

Licensed CC-BY-NC-SA-4.0

The dataset for this challenge was obtained by carefully annotating tissue images of several patients with tumors of different organs who were diagnosed at multiple hospitals. This dataset was created by downloading H&E stained tissue images captured at 40x magnification from the TCGA archive. H&E staining is a routine protocol to enhance the contrast of a tissue section and is commonly used for tumor assessment (grading, staging, etc.). Given the diversity of nuclei appearances across multiple organs and patients, and the richness of staining protocols adopted at multiple hospitals, the training dataset will enable the development of robust and generalizable nuclei segmentation techniques that will work right out of the box.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://monuseg.grand-challenge.org/Data/

MonuSAC 2020#

Ruchika Verma, Neeraj Kumar, Abhijeet Patil, Nikhil Cherian Kurian, Swapnil Rane, Simon Graham

Published 2021-06-04

Licensed CC-BY-NC-SA-4.0

H&E staining of human tissue sections is the most common routine protocol used by pathologists to enhance the contrast of tissue sections for tumor assessment (grading, staging, etc.) at multiple microscopic resolutions. Hence, we will provide an annotated dataset of H&E stained digitized tissue images of several patients, acquired at multiple hospitals at 40x, one of the most common scanner magnifications. The annotations will be done with the help of expert pathologists.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://monusac-2020.grand-challenge.org/Data/

Mouse embryo blastocyst cells#

Vebjorn Ljosa, Katherine L. Sokolnicki, Anne E. Carpenter

Published 2012-06-28

Licensed CC0-1.0

Segmenting nuclei in 3D images can be challenging, especially when nuclei are clustered not only in the XY plane but also in the XZ and YZ planes. Manually annotated ground truth provides a reference for testing image analysis software. In these images of mouse embryo blastocyst cells, the nuclei intensity also changes in the Z direction, which makes finding the right threshold for successful segmentation a difficult task. This image set also contains GAPDH transcripts that can be quantified in each cell.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC032

NeurIPS 2022 Cell Segmentation Competition Dataset#

Jun Ma, Ronald Xie, Shamini Ayyadhury, Cheng Ge, Anubha Gupta, Ritu Gupta, Song Gu, Yao Zhang, Gihun Lee, Joonkee Kim, Wei Lou, Haofeng Li, Eric Upschulte, Timo Dickscheid, José Guilherme de Almeida, Yixin Wang, Lin Han, Xin Yang, Marco Labagnara, Vojislav Gligorovski, Maxime Scheder, Sahand Jamal Rahi, Carly Kempster, Alice Pollitt, Leon Espinosa, Tâm Mignot, Jan Moritz Middeke, Jan-Niklas Eckardt, Wangkai Li, Zhaoyang Li, Xiaochen Cai, Bizhe Bai, Noah F. Greenwald, David Van Valen, Erin Weisbart, Beth A. Cimini, Trevor Cheung, Oscar Brück, Gary D. Bader, Bo Wang

Published 2024-02-27

Licensed CC-BY-NC-ND-4.0

The official dataset for the NeurIPS 2022 competition: cell segmentation in multi-modality microscopy images. https://neurips22-cellseg.grand-challenge.org/

Please cite the following paper if this dataset is used in your research.

@article{NeurIPS-CellSeg,
  title = {The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions},
  author = {Jun Ma and Ronald Xie and Shamini Ayyadhury and Cheng Ge and Anubha Gupta and Ritu Gupta and Song Gu and Yao Zhang and Gihun Lee and Joonkee Kim and Wei Lou and Haofeng Li and Eric Upschulte and Timo Dickscheid and José Guilherme de Almeida and Yixin Wang and Lin Han and Xin Yang and Marco Labagnara and Vojislav Gligorovski and Maxime Scheder and Sahand Jamal Rahi and Carly Kempster and Alice Pollitt and Leon Espinosa and Tâm Mignot and Jan Moritz Middeke and Jan-Niklas Eckardt and Wangkai Li and Zhaoyang Li and Xiaochen Cai and Bizhe Bai and Noah F. Greenwald and David Van Valen and Erin Weisbart and Beth A. Cimini and Trevor Cheung and Oscar Brück and Gary D. Bader and Bo Wang},
  journal = {Nature Methods},
  volume = {21},
  pages = {1103–1113},
  year = {2024},
  doi = {https://doi.org/10.1038/s41592-024-02233-6}
}

This is an instance segmentation task where each cell has an individual label under the same category (cells). The training set contains both labeled images and unlabeled images. You can only use the labeled images to develop your model but we encourage participants to try to explore the unlabeled images through weakly supervised learning, semi-supervised learning, and self-supervised learning. The images are provided in their original formats, including tiff, tif, png, jpg, bmp… The original formats contain the most information for competitors and you have free choice over different normalization methods. For the ground truth, we standardize them as tiff formats.

We aim to maintain this challenge as a sustainable benchmark platform. If you find the top algorithms (https://neurips22-cellseg.grand-challenge.org/awards/) don’t perform well on your images, you are welcome to send us the dataset (neurips.cellseg@gmail.com)! We will include them in the new testing set and credit your contributions on the challenge website!

Dataset License: CC-BY-NC-ND

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10719375

https://doi.org/10.5281/zenodo.10719375

NuInsSeg#

Amirreza Mahbod, Christine Polak, Katharina Feldmann, Rumsha Khan, Katharina Gelles, Georg Dorffner, Ramona Woitek, Sepideh Hatamikia, Isabella Ellinger

Published 2024-05-14

Licensed CC-BY-4.0

A Fully Annotated Dataset for Nuclei Instance Segmentation in H&E-Stained Images

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.kaggle.com/datasets/ipateam/nuinsseg

Nuclei of U2OS cells in a chemical screen#

Vebjorn Ljosa, Katherine L. Sokolnicki, Anne E. Carpenter

Published 2012-06-28

Licensed CC0-1.0

This image set is part of a high-throughput chemical screen on U2OS cells, with examples of 200 bioactive compounds. The effect of the treatments was originally imaged using the Cell Painting assay (fluorescence microscopy). This data set only includes the DNA channel of a single field of view per compound. These images present a variety of nuclear phenotypes, representative of high-throughput chemical perturbations. The main use of this data set is the study of segmentation algorithms that can separate individual nucleus instances in an accurate way, regardless of their shape and cell density. The collection has around 23,000 single nuclei manually annotated to establish a ground truth collection for segmentation evaluation.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC039

Nuclei of mouse embryonic cells#

Vebjorn Ljosa, Katherine L. Sokolnicki, Anne E. Carpenter

Published 2012-06-28

Licensed CC0-1.0

Cell dynamics during the early mouse embryogenesis change spatiotemporally. For understanding the mechanism of this developmental process, imaging cell dynamics by live-cell imaging of fluorescently labeled nuclei and performing nuclei segmentation of these images by image processing are essential. This dataset contains the fluorescence images and Ground Truth used when performing nuclei segmentation using deep learning. Fluorescence images are time-series images from fertilization to blastocyst formation. Ground Truth is supervised data of the cell nuclear region.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC050

OCELOT: Overlapped Cell on Tissue Dataset for Histopathology#

Jeongun Ryu, Aaron Valero Puche, JaeWoong Shin, Seonwook Park, Biagio Brattoli, Mohammad Mostafavi, Jinhee Lee, Sérgio Pereira, Wonkyung Jung, Soo Ick Cho, Chan-Young Ock, Kyunghyun Paeng, Donggeun Yoo

Published 2023-03-23

The OCELOT dataset is a histopathology dataset designed to facilitate the development of methods that utilize cell and tissue relationships. The dataset comprises both small and large field-of-view (FoV) patches extracted from digitally scanned whole slide images (WSIs), with overlapping regions. The small and large FoV patches are accompanied by annotations of cells and tissues, respectively. The WSIs are sourced from the publicly available TCGA database and were stained using the H&E method before being scanned with an Aperio scanner.

For more details, please check https://lunit-io.github.io/research/ocelot_dataset/.

Before downloading the dataset, please make sure to carefully read and agree to the Terms and Conditions at (https://lunit-io.github.io/research/ocelot_tc/).

Also, please provide 1. name, 2. e-mail address, 3. organization/company name.

Release note.

In version 1.0.1, we excluded four test cases (586, 589, 609, 615) because they were under-annotated. In version 1.0.0, we included images and annotations of the validation and test splits. In version 0.1.2, we modified the coordinates of cell labels to range from 0 to 1023 (-1 from the previous coordinates). In version 0.1.1, we removed non-H&E stained patches from the dataset.
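The coordinate change in version 0.1.2 amounts to a shift from one-based to zero-based pixel indices. A minimal sketch, assuming each cell label is an (x, y, class) tuple; that layout and the function name are illustrative and not taken from the dataset documentation.

```python
def to_zero_based(cell_labels):
    """Shift one-based (x, y) coordinates by -1, leaving the class untouched."""
    return [(x - 1, y - 1, cls) for x, y, cls in cell_labels]


# Coordinates in 1..1024 map to 0..1023.
old_labels = [(1, 1, 1), (1024, 512, 2)]
new_labels = to_zero_based(old_labels)
```

Zero-based coordinates can index a 1024 x 1024 array directly without an off-by-one correction.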

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/8417503

https://doi.org/10.5281/zenodo.8417503

Parhyale 3D segmentation dataset#

Frederike Alwes, Ko Sugawara, Michalis Averof

Published 2023-08-11

Licensed CC-BY-4.0

The Parhyale 3D Segmentation dataset consists of 50 timepoints (TP01-TP50) of 3D images (512x512x34), with manual annotations available at 6 discrete timepoints (TP01, TP11, TP21, TP31, TP41 and TP50).

For further details, see README file.

This version fixes the duplicated label IDs found in the previous version of label files. This version ensures that each instance has a unique ID. Thanks to Jackson Borchardt for reporting that error.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/8252039

https://doi.org/10.5281/zenodo.8252039

Platynereis EM training data#

Constantin Pape

Published 2020-02-19

Licensed CC-BY-4.0

Training data for the convolutional neural networks used in the publication Whole-body integration of gene expression and single-cell morphology. We provide training data for segmenting structures in the serial block-face electron microscopy dataset containing a complete 6-day-old Platynereis dumerilii larva, in particular for:

cell membranes: 9 training blocks @ resolution 20x20x25 nm. Based on initial training data provided by https://ariadne.ai/.
cilia: 3 training and 2 validation blocks @ resolution 20x20x25 nm.
cuticle: 5 training blocks @ resolution 40x40x50 nm.
nuclei: 12 training blocks @ resolution 80x80x100 nm. Based on initial training data provided by https://ariadne.ai/.
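The block counts and resolutions listed above can be collected in a small table, sketched here to show how the coarser resolutions relate to the finest one. The dictionary layout and function name are illustrative, and the axis order is kept exactly as written in the list.

```python
# Values transcribed from the list above; resolutions are in nanometres.
TRAINING_BLOCKS = {
    "cell membranes": {"training": 9, "resolution_nm": (20, 20, 25)},
    "cilia": {"training": 3, "validation": 2, "resolution_nm": (20, 20, 25)},
    "cuticle": {"training": 5, "resolution_nm": (40, 40, 50)},
    "nuclei": {"training": 12, "resolution_nm": (80, 80, 100)},
}

BASE_RESOLUTION_NM = (20, 20, 25)  # finest resolution in the list


def downscale_factor(structure):
    """Per-axis factor between a structure's resolution and the finest one."""
    resolution = TRAINING_BLOCKS[structure]["resolution_nm"]
    return tuple(r // b for r, b in zip(resolution, BASE_RESOLUTION_NM))


factors = {name: downscale_factor(name) for name in TRAINING_BLOCKS}
total_training_blocks = sum(b["training"] for b in TRAINING_BLOCKS.values())
```

Under this reading, the cuticle data is sampled 2x coarser and the nuclei data 4x coarser than the membrane and cilia data along every axis.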

For details on how to use this data for training, see platybrowser/platybrowser-backend.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/3675220

https://doi.org/10.5281/zenodo.3675220

Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides#

Wenqi Tang, MIC Group

Published 2021-12-12

Licensed UNLICENSED

This repo is the official implementation of our paper “Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides”.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

bupt-ai-cz/BALNMP

ProdgerLab-StarDist-HIV Target Cell Training Set#

Zhongtian Shao

Published 2023-06-28

Licensed CC-BY-4.0

40 annotated immunofluorescence microscopy images (600 microns x 600 microns) of foreskin tissue stained for CD3/CD4/CCR5/Nuclei. These images were used to train StarDist models used for the identification of HIV Target Cells in foreskin tissue section scans.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/8091914

https://doi.org/10.5281/zenodo.8091914

Root tissue segmentation dataset#

Julian Wanner, Luis Kuhn Cuellar, Friederike Wanke

Published 2022-01-12

Licensed CC-BY-4.0

The PHDFM dataset is composed of fluorescence microscopy images of root tissue samples from A. thaliana, using the ratiometric fluorescent indicator 8‐hydroxypyrene‐1,3,6‐trisulfonic acid trisodium salt (HPTS). This semantic segmentation training dataset consists of 2D microscopy images (the brightfield channel for excitation at 405 nm), each containing a segmentation mask as an additional image channel (manually annotated by plant biologists). The segmentation masks classify pixels into the following 5 labels with the corresponding IDs: background (0), root tissue (1), early elongation zone (2), late elongation zone (3), and meristematic zone (4).
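The label IDs listed above can be turned into a small lookup table. The following is a minimal sketch, assuming a mask is available as a 2D grid of integer label IDs; the function name `class_fractions` and the toy mask are hypothetical and not part of the dataset.

```python
from collections import Counter

# Label IDs as listed in the dataset description.
LABELS = {
    0: "background",
    1: "root tissue",
    2: "early elongation zone",
    3: "late elongation zone",
    4: "meristematic zone",
}


def class_fractions(mask):
    """Return the fraction of pixels per label name for a 2D mask (rows of ints)."""
    counts = Counter(value for row in mask for value in row)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected label IDs: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("mask is empty")
    return {name: counts.get(label_id, 0) / total for label_id, name in LABELS.items()}


# Toy 2x4 mask with made-up values, for illustration only.
toy_mask = [
    [0, 0, 1, 1],
    [2, 3, 4, 4],
]
fractions = class_fractions(toy_mask)
```

A check like this can flag masks that contain IDs outside the documented range before training starts.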

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/5841376

https://doi.org/10.5281/zenodo.5841376

Segmentation of Nuclei in Histopathology Images by deep regression of the distance map#

Naylor Peter Jack, Walter Thomas, Laé Marick, Reyal Fabien

Published 2018-02-16

Licensed CC-BY-4.0

This dataset was announced in our accepted paper “Segmentation of Nuclei in Histopathology Images by deep regression of the distance map” in Transactions on Medical Imaging on the 13th of August. This dataset consists of 50 annotated images, divided into 11 patients.

Tags: Nuclei Images, Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/1175282#.WyP61xy-l5E

Segmenting cells in a spheroid in 3D using 2D StarDist within TrackMate#

Jean-Yves Tinevez, Joanna W. Pylvänäinen, Guillaume Jacquemet

Published 2021-08-19

Licensed CC-BY-4.0

3D image of cells in a spheroid, imaged on a confocal microscope, used in a tutorial to demonstrate how to hack TrackMate to segment cells in 3D using the 2D segmentation algorithms it ships.

Image by Guillaume Jacquemet.

For more details see https://imagej.net/plugins/trackmate/trackmate-stardist#generation-of-3d-labels-by-tracking-2d-labels-using-trackmate

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/5220610

https://doi.org/10.5281/zenodo.5220610

Simulated HL60 cells (from the Cell Tracking Challenge)#

Vebjorn Ljosa, Katherine L. Sokolnicki, Anne E. Carpenter

Published 2012-06-28

Licensed CC0-1.0

These are synthetic images from the Cell Tracking Challenge. The images depict simulated nuclei of HL60 cells stained with Hoechst (training datasets). These synthetic images of HL60 cells provide an opportunity to test image analysis software by comparing segmentation results to the available ground truth for each time point. The number of clustered nuclei increases with time, adding more complexity to the problem. This time-lapse dataset can be used for simple segmentation or for nuclei tracking.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC035

Single-cell approach dissecting agr quorum sensing dynamics in Staphylococcus aureus#

Julian Bär

Published 2024-02-28

Licensed CC-BY-4.0

Training data for the two StarDist2D models and the DeLTA 2.0 2D tracking model used in the publication on bioRxiv. The trained StarDist models are included in the respective zip files of the training data. mm: mother-machine; cc: connected chamber. Each of them contains two folders, img and seg_label, which hold matching pairs of phase-contrast images (img) and label images (seg_label). tracking_set_subset.zip contains the training data for the DeLTA tracking model following the default folder structure. We used custom weight functions to create the training weight maps in the folder wei. The folder wei_bck contains weights generated with the original function. The unet_pads_tracking.hdf5 is the retrained tracking model used in the associated publication. See associated GitHub repository for example code on how to use the models for segmentation and tracking. The four numbered zip files contain the data used to create all figures displaying image analysis output.

Abstract: Staphylococcus aureus both colonizes humans and causes severe virulent infections. Virulence is regulated by the agr quorum sensing system and its autoinducing peptide (AIP), with dynamics at the single-cell level across four agr-types – each defined by distinct AIP sequences and capable of cross-inhibition – remaining elusive. Employing microfluidics, time-lapse microscopy, and deep-learning image analysis, we uncovered significant differences in AIP sensitivity among agr-types. We observed bimodal agr activation, attributed to intergenerational phenotypic stability and influenced by AIP concentration. Upon AIP stimulation, agr‑III showed AIP insensitivity, while agr‑II exhibited increased sensitivity and prolonged generation time. Beyond expected cross-inhibition of agr‑I by heterologous AIP‑II and ‑III, the presumably cross-activating AIP‑IV also inhibited agr‑I. Community interactions across different agr-type pairings revealed four main patterns: stable or switched dominance, and delayed or stable dual activation, influenced by community characteristics. These insights underscore the potential of personalized treatment strategies considering virulence and genetic diversity.
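For the img and seg_label folders described in this entry, the image/label pairs could be collected along the following lines. This is a sketch only: it assumes that an image and its label share the same file name, which the description does not state, and `pair_images_and_labels` is a hypothetical helper. The demo builds an empty throwaway folder tree rather than touching real data.

```python
import tempfile
from pathlib import Path


def pair_images_and_labels(root):
    """Match files in <root>/img with same-named files in <root>/seg_label."""
    img_dir = Path(root) / "img"
    label_dir = Path(root) / "seg_label"
    labels = {p.name: p for p in label_dir.iterdir() if p.is_file()}
    pairs = []
    for image in sorted(img_dir.iterdir()):
        # Assumption: identical file names mark a matching pair.
        if image.is_file() and image.name in labels:
            pairs.append((image, labels[image.name]))
    return pairs


# Demo on a temporary folder tree with placeholder (empty) files.
with tempfile.TemporaryDirectory() as tmp:
    for folder in ("img", "seg_label"):
        (Path(tmp) / folder).mkdir()
    for name in ("a.tif", "b.tif"):
        (Path(tmp) / "img" / name).touch()
    (Path(tmp) / "seg_label" / "a.tif").touch()
    demo_pairs = [(i.name, l.name) for i, l in pair_images_and_labels(tmp)]
```

Images without a label file (here `b.tif`) are skipped, which makes incomplete downloads easy to spot by comparing counts.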

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10720439

https://doi.org/10.5281/zenodo.10720439

StarDist Adipocyte Segmentation Training data, Training Notebook and Model#

Sarkis Rita, Naveiras Olaia, Burri Olivier, Weigert Martin, De Leval Laurence

Published 2022-08-17

Licensed CC-BY-4.0

Data from H&E human bone marrow whole slide scanner images used in the paper: “MarrowQuant 2.0: a digital pathology workflow assisting bone marrow evaluation in clinical and experimental hematology” (https://doi.org/10.21203/rs.3.rs-1860140/v1)

292 image patches

Ground truth was manually annotated using QuPath and split into 263 images for training and 29 for validation.
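A split of 292 patches into 263 training and 29 validation images can be reproduced in form, though not in content, with a seeded shuffle. The sketch below is an assumption about how such a split might be drawn; the authors' actual assignment of patches is not described here, and the file names are hypothetical.

```python
import random


def split_patches(names, n_validation, seed=0):
    """Shuffle names reproducibly and return (training, validation) lists."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical patch names standing in for the 292 image patches.
patch_names = [f"patch_{i:03d}.tif" for i in range(292)]
train, validation = split_patches(patch_names, n_validation=29)
```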

Training in StarDist was done on a Windows 10 PC with an RTX 2080 GPU. The requirements file for installing a Python 3.7 environment to run the attached notebooks is provided (stardist-val.txt).

The StarDist model configuration can be found in the Jupyter notebook:

Adipocyte Training.ipynb

Model validation and metrics can be computed by running the following notebook after finishing the Adipocyte Training notebook:

Quality Control.ipynb

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/7003909

https://doi.org/10.5281/zenodo.7003909

StarDist model and data for the segmentation of Yersinia enterocolitica cells in widefield images#

Christoph Spahn, Andreas Diepold, Francesca Ermoli

Published 2024-05-02

Licensed CC-BY-4.0

Dataset and StarDist model for the segmentation of Yersinia enterocolitica cells. This dataset and StarDist model are part of the publication “Active downregulation of the type III secretion system at higher local cell densities promotes Yersinia replication and dissemination”. It contains the dataset that was used for training the provided StarDist model using ZeroCostDL4Mic.

Data: Yersinia enterocolitica cells were spotted on an agarose pad (1.5% low melting agarose (Sigma-Aldrich) in minimal medium, 1% Casamino acids, 5 mM EGTA, glass depression slides (Marienfeld)). For imaging, a Deltavision Elite Optical Sectioning Microscope equipped with a UPlanSApo 100×/1.40 oil objective (Olympus) and an EDGE sCMOS_5.5 camera (Photometrics) was used. Z-stacks with 9 slices (∆z = 0.15 µm) per fluorescence channel were acquired and 5 slices were selected for network training. Images were annotated in Fiji using the Freehand selection tool, and brightlight and mask images were quartered to obtain the final dataset of 300 paired images. 260 images were used for training, while 40 images were used to test model performance.

Model: The StarDist 2D model was trained from scratch for 100 epochs on 300 paired image patches (image dimensions: (480 x 480 px²), patch size: (480 x 480 px²)) with a batch size of 4 and a mae loss function, using the StarDist 2D ZeroCostDL4Mic notebook (v 1) (von Chamier & Laine et al., 2020). Grid parameter was set to 2 and the number of rays to 120. The model was trained with an initial learning rate of 0.0003 using an 80/20 train/test split. The dataset was augmented 4-fold by flipping and rotation. Key Python packages used include tensorflow (v 0.1.12), Keras (v2.3.1), csbdeep (v 0.7.2), numpy (v 1.21.6), cuda (v 11.1.105Build cuda_11.1.TC455_06.29190527_0). The training was accelerated using a Tesla T4 GPU.
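The quartering step mentioned in this entry can be illustrated as follows. This is a minimal sketch, assuming each image is cut into four equal tiles along its midlines; the exact procedure used in Fiji is not described, and `quarter` is a hypothetical helper operating on plain nested lists.

```python
def quarter(image):
    """Split a 2D image (list of equal-length rows) into four equal tiles.

    Tiles are returned in the order top-left, top-right, bottom-left, bottom-right.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    h, w = height // 2, width // 2
    return [
        [row[x:x + w] for row in image[y:y + h]]
        for y in (0, h)
        for x in (0, w)
    ]


# Toy 4x4 image with made-up pixel values.
toy = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
tiles = quarter(toy)
```

The same function would be applied to an image and its mask so that the resulting tiles stay paired.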

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/11105050

https://doi.org/10.5281/zenodo.11105050

StarDist_AsPC1_Lifeact#

Gautier Follain, Sujan Ghimire, Joanna Pylvänäinen, Johanna Ivaska, Guillaume Jacquemet

Published 2024-08-29

Licensed CC-BY-4.0

This repository includes a StarDist deep learning model designed for segmenting AsPC1 cells labeled with Lifeact from fluorescence microscopy images. The model distinguishes individual AsPC1 cells within clusters and separates them from the background. The model was trained on a small dataset and achieved an Intersection over Union (IoU) score of 0.884 and an F1 Score of 0.967, indicating high accuracy in cell segmentation.

Specifications

Model: StarDist for segmenting AsPC1 cells in fluorescence microscopy images

Training Dataset:

Number of Images: 10 paired fluorescence microscopy images and label masks

Microscope: Spinning disk confocal microscope (3i CSU-W1) with a 20x objective, NA 0.8

Data Type: Fluorescence microscopy images of the AsPC1 Lifeact channel with manually segmented masks

File Format: TIFF (.tif)

Fluorescence Images: 16-bit

Masks: 8-bit

Image Size: 1024 x 1024 pixels (Pixel size: 0.6337 x 0.6337 µm²)

Model Capabilities:

Segment AsPC1 Cells: Detects individual AsPC1 cells from a cluster and separates them from the background

Measure Intensity: Enables measurement of CD44, ICAM1, ICAM2, or Fibronectin intensity under individual cells in respective channels

Performance:

Average IoU: 0.884

Average F1 Score: 0.967

Model Training: Conducted using ZeroCostDL4Mic (HenriquesLab/ZeroCostDL4Mic)
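As a reminder of what the two reported metrics measure, here is a minimal sketch of a pixel-wise IoU on binary masks and an F1 score from detection counts. How ZeroCostDL4Mic matches objects and averages these values is not described in this entry, so the functions below are generic definitions, not the notebook's implementation.

```python
def iou(pred, truth):
    """Pixel-wise intersection over union of two equal-shaped binary 2D masks."""
    intersection = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += bool(p) and bool(t)
            union += bool(p) or bool(t)
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2TP / (2TP + FP + FN)."""
    denominator = 2 * true_positives + false_positives + false_negatives
    return 2 * true_positives / denominator if denominator else 1.0


# Toy masks and counts, made up for illustration.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
toy_iou = iou(pred, truth)   # 2 overlapping pixels out of 4 in the union
toy_f1 = f1_score(9, 1, 1)   # 18 / 20
```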

Reference: Fast label-free live imaging reveals key roles of flow dynamics and CD44-HA interaction in cancer cell arrest on endothelial monolayers

Gautier Follain, Sujan Ghimire, Joanna W. Pylvänäinen, Monika Vaitkevičiūtė, Diana Wurzinger, Camilo Guzmán, James RW Conway, Michal Dibus, Sanna Oikari, Kirsi Rilla, Marko Salmi, Johanna Ivaska, Guillaume Jacquemet bioRxiv 2024.09.30.615654; doi: https://doi.org/10.1101/2024.09.30.615654

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/13442128

https://doi.org/10.5281/zenodo.13442128

StarDist_BF_Monocytes_dataset#

Gautier Follain, Sujan Ghimire, Joanna Pylvänäinen, Johanna Ivaska, Guillaume Jacquemet

Published 2024-01-26

Licensed CC-BY-4.0

This repository includes a StarDist deep learning model and its training and validation datasets for detecting mononucleated cells perfused over an endothelial cell monolayer. The model was trained on 27 manually annotated images and achieved an average F1 Score of 0.941. The dataset and model are helpful for biomedical research, especially in studying interactions between mononucleated and endothelial cells.

Specifications

Model: StarDist for mononucleated cell detection on endothelial cells

Training Dataset:

Number of Images: 27 paired brightfield microscopy images and label masks

Microscope: Nikon Eclipse Ti2-E, 20x objective

Data Type: Brightfield microscopy images with manually segmented masks

File Format: TIFF (.tif)

Brightfield Images: 16-bit

Masks: 8-bit

Image Size: 1024 x 1022 pixels (Pixel size: 650 nm)

Training Parameters:

Epochs: 400

Patch Size: 992 x 992 pixels

Batch Size: 2

Performance:

Average F1 Score: 0.941

Average IoU: 0.831

Model Training: Conducted using ZeroCostDL4Mic (HenriquesLab/ZeroCostDL4Mic)

Reference: Fast label-free live imaging reveals key roles of flow dynamics and CD44-HA interaction in cancer cell arrest on endothelial monolayers

Gautier Follain, Sujan Ghimire, Joanna W. Pylvänäinen, Monika Vaitkevičiūtė, Diana Wurzinger, Camilo Guzmán, James RW Conway, Michal Dibus, Sanna Oikari, Kirsi Rilla, Marko Salmi, Johanna Ivaska, Guillaume Jacquemet bioRxiv 2024.09.30.615654; doi: https://doi.org/10.1101/2024.09.30.615654

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10572200

https://doi.org/10.5281/zenodo.10572200

StarDist_BF_Neutrophil_dataset#

Gautier Follain, Sujan Ghimire, Joanna Pylvänäinen, Johanna Ivaska, Guillaume Jacquemet

Published 2024-01-26

Licensed CC-BY-4.0

This repository includes a StarDist deep learning model and its training and validation datasets for detecting neutrophils perfused over an endothelial cell monolayer. The model was trained on 36 manually annotated images, achieving an average F1 Score of 0.969. The dataset and model are intended for use in biomedical research, particularly for analyzing interactions between neutrophils and endothelial cells.

Specifications

Model: StarDist for neutrophil detection on endothelial cells

Training Dataset:

Number of Images: 36 paired brightfield microscopy images and label masks

Microscope: Nikon Eclipse Ti2-E, 20x objective

Data Type: Brightfield microscopy images with manually segmented masks

File Format: TIFF (.tif)

Brightfield Images: 16-bit

Masks: 8-bit

Image Size: 1024 x 1022 pixels (Pixel size: 650 nm)

Training Parameters:

Epochs: 400

Patch Size: 992 x 992 pixels

Batch Size: 2

Performance:

Average F1 Score: 0.969

Average IoU: 0.914

Model Training: Conducted using ZeroCostDL4Mic (HenriquesLab/ZeroCostDL4Mic)

Reference: Fast label-free live imaging reveals key roles of flow dynamics and CD44-HA interaction in cancer cell arrest on endothelial monolayers

Gautier Follain, Sujan Ghimire, Joanna W. Pylvänäinen, Monika Vaitkevičiūtė, Diana Wurzinger, Camilo Guzmán, James RW Conway, Michal Dibus, Sanna Oikari, Kirsi Rilla, Marko Salmi, Johanna Ivaska, Guillaume Jacquemet bioRxiv 2024.09.30.615654; doi: https://doi.org/10.1101/2024.09.30.615654

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10572231

https://doi.org/10.5281/zenodo.10572231

StarDist_BF_cancer_cell_dataset_10x#

Gautier Follain, Sujan Ghimire, Joanna Pylvänäinen, Johanna Ivaska, Guillaume Jacquemet

Published 2024-08-12

Licensed CC-BY-4.0

This repository includes a StarDist deep learning model and its training dataset designed for segmenting cancer cells perfused over an endothelial cell monolayer captured at 10x magnification. The model was trained on 77 manually annotated images, with the dataset being computationally augmented during training by a factor of 8. The model was trained for 500 epochs and achieved an average F1 Score of 0.968, indicating high accuracy in segmenting cancer cells on endothelial cells.

Specifications

Model: StarDist for cancer cell segmentation on endothelial cells (10x magnification)

Training Dataset:

Number of Images: 77 paired brightfield microscopy images and label masks

Augmented Dataset: Computational augmentation by a factor of 8 during training

Microscope: Nikon Eclipse Ti2-E, 10x objective

Data Type: Brightfield microscopy images with manually segmented masks

File Format: TIFF (.tif)

Brightfield Images: 16-bit

Masks: 8-bit or 16-bit

Image Size: 1024 x 1022 pixels (pixel size: 1.3148 μm)

Training Parameters:

Epochs: 500

Patch Size: 992 x 992 pixels

Batch Size: 2

Performance:

Average F1 Score: 0.968

Average IoU: 0.882

Model Training: Conducted using ZeroCostDL4Mic (HenriquesLab/ZeroCostDL4Mic)

Reference: Fast label-free live imaging reveals key roles of flow dynamics and CD44-HA interaction in cancer cell arrest on endothelial monolayers

Gautier Follain, Sujan Ghimire, Joanna W. Pylvänäinen, Monika Vaitkevičiūtė, Diana Wurzinger, Camilo Guzmán, James RW Conway, Michal Dibus, Sanna Oikari, Kirsi Rilla, Marko Salmi, Johanna Ivaska, Guillaume Jacquemet bioRxiv 2024.09.30.615654; doi: https://doi.org/10.1101/2024.09.30.615654

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/13304399

https://doi.org/10.5281/zenodo.13304399

StarDist_BF_cancer_cell_dataset_20x#

Gautier Follain, Sujan Ghimire, Joanna Pylvänäinen, Johanna Ivaska, Guillaume Jacquemet

Published 2024-01-26

Licensed CC-BY-4.0

This repository contains a StarDist deep learning model and its training and validation datasets designed for segmenting cancer cells perfused over an endothelial cell monolayer captured at 20x magnification. Using computational methods, the initial dataset of 20 manually annotated images was augmented to 160 paired images. The model was trained over 400 epochs and achieved an average F1 Score of 0.921, demonstrating high accuracy in cell segmentation tasks.

Specifications

Model: StarDist for cancer cell segmentation on endothelial cells (20x magnification)

Training Dataset:

Number of Original Images: 20 paired brightfield microscopy images and label masks

Microscope: Nikon Eclipse Ti2-E, 20x objective

Data Type: Brightfield microscopy images with manually segmented masks

File Format: TIFF (.tif)

Brightfield Images: 16-bit

Masks: 8-bit

Image Size: 1024 x 1022 pixels (Pixel size: 650 nm)

Training Parameters:

Epochs: 400

Patch Size: 992 x 992 pixels

Batch Size: 2

Performance:

Average F1 Score: 0.921

Average IoU: 0.793

Model Training: Conducted using ZeroCostDL4Mic (HenriquesLab/ZeroCostDL4Mic)

Reference: Fast label-free live imaging reveals key roles of flow dynamics and CD44-HA interaction in cancer cell arrest on endothelial monolayers

Gautier Follain, Sujan Ghimire, Joanna W. Pylvänäinen, Monika Vaitkevičiūtė, Diana Wurzinger, Camilo Guzmán, James RW Conway, Michal Dibus, Sanna Oikari, Kirsi Rilla, Marko Salmi, Johanna Ivaska, Guillaume Jacquemet bioRxiv 2024.09.30.615654; doi: https://doi.org/10.1101/2024.09.30.615654

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10572122

https://doi.org/10.5281/zenodo.10572122

StarDist_Fluorescent_cells#

Gautier Follain, Sujan Ghimire, Joanna Pylvänäinen, Johanna Ivaska, Guillaume Jacquemet

Published 2024-01-26

Licensed CC-BY-4.0

This repository includes a StarDist deep learning model and its training and validation datasets for detecting fluorescently labeled cancer cells perfused over an endothelial cell monolayer. The model was trained on 66 images labeled with CellTrace and demonstrated high accuracy, achieving an average F1 Score of 0.877. The dataset and the trained model can be used for biomedical image analysis, particularly in cancer research.

Specifications

Model: StarDist for cancer cell detection

Training Dataset:

Number of Images: 66 paired fluorescent microscopy images and label masks

Microscope: Nikon Eclipse Ti2-E, 10x objective

Data Type: Fluorescent microscopy images with manually segmented masks

File Format: TIFF (.tif)

Fluorescence Images: 16-bit

Masks: 8-bit

Image Size: 1024 x 1024 pixels (Pixel size: 1.3205 μm)

Training Parameters:

Epochs: 200

Patch Size: 1024 x 1024 pixels

Batch Size: 2

Performance:

Average F1 Score: 0.877

Average IoU: 0.646

Model Training: Conducted using ZeroCostDL4Mic (HenriquesLab/ZeroCostDL4Mic)

Reference: Fast label-free live imaging reveals key roles of flow dynamics and CD44-HA interaction in cancer cell arrest on endothelial monolayers

Gautier Follain, Sujan Ghimire, Joanna W. Pylvänäinen, Monika Vaitkevičiūtė, Diana Wurzinger, Camilo Guzmán, James RW Conway, Michal Dibus, Sanna Oikari, Kirsi Rilla, Marko Salmi, Johanna Ivaska, Guillaume Jacquemet bioRxiv 2024.09.30.615654; doi: https://doi.org/10.1101/2024.09.30.615654

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10572310

https://doi.org/10.5281/zenodo.10572310

StarDist_HUVEC_nuclei_dataset#

Gautier Follain, Sujan Ghimire, Joanna Pylvänäinen, Johanna Ivaska, Guillaume Jacquemet

Published 2024-02-05

Licensed CC-BY-4.0

This repository contains a StarDist deep learning model and its training and validation datasets for segmenting endothelial nuclei while ignoring cancer cells. The cancer cells were perfused over an endothelial cell monolayer. The initial dataset consisted of 17 images, where cancer cell nuclei were manually removed after segmentation with the StarDist Versatile Nuclei model. This dataset was augmented to 68 paired images using computational techniques like rotation and flipping. The model was trained for 200 epochs, achieving an average F1 Score of 0.976, demonstrating high accuracy in segmenting endothelial nuclei while excluding cancer cells.

Specifications

Model: StarDist for segmenting endothelial nuclei while ignoring cancer cells

Training Dataset:

Number of Original Images: 17 paired predictions of nuclei and label images

Augmented Dataset: Expanded to 68 paired images using rotation and flipping

Source Image Generation: Generated using a pix2pix model trained to predict nuclei from brightfield images of cancer cells on top of an endothelium (DOI: 10.5281/zenodo.10617532)

Target Image Generation: Masks obtained via manual segmentation

File Format: TIFF (.tif)

Brightfield Images: 8-bit

Masks: 8-bit

Image Size: 1024 x 1022 pixels (uncalibrated)

Training Parameters:

Epochs: 200

Patch Size: 1024 x 1024 pixels

Batch Size: 2

Performance:

Average F1 Score: 0.976

Average IoU: 0.927

Model Training: Conducted using ZeroCostDL4Mic (HenriquesLab/ZeroCostDL4Mic)
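The expansion from 17 to 68 paired images listed above corresponds to four variants per original pair. The sketch below shows one way to obtain four variants with flips and a rotation. The particular set of transforms (identity, horizontal flip, vertical flip, 180° rotation) is an assumption chosen because it preserves the image shape; the entry does not say which transforms were used.

```python
def identity(image):
    return [row[:] for row in image]


def flip_horizontal(image):
    return [row[::-1] for row in image]


def flip_vertical(image):
    return [row[:] for row in image[::-1]]


def rotate_180(image):
    return [row[::-1] for row in image[::-1]]


def augment_pair(image, mask):
    """Return four (image, mask) variants; each transform is applied to both."""
    transforms = (identity, flip_horizontal, flip_vertical, rotate_180)
    return [(t(image), t(mask)) for t in transforms]


# Toy 2x2 image and mask with made-up values.
toy_image = [[1, 2], [3, 4]]
toy_mask = [[0, 1], [1, 0]]
augmented = augment_pair(toy_image, toy_mask)
n_total = 17 * len(augmented)  # 17 originals, four variants each
```

Applying the identical transform to the image and its mask keeps every label aligned with the pixels it annotates.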

Reference: Fast label-free live imaging reveals key roles of flow dynamics and CD44-HA interaction in cancer cell arrest on endothelial monolayers

Gautier Follain, Sujan Ghimire, Joanna W. Pylvänäinen, Monika Vaitkevičiūtė, Diana Wurzinger, Camilo Guzmán, James RW Conway, Michal Dibus, Sanna Oikari, Kirsi Rilla, Marko Salmi, Johanna Ivaska, Guillaume Jacquemet bioRxiv 2024.09.30.615654; doi: https://doi.org/10.1101/2024.09.30.615654

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/10617532

https://doi.org/10.5281/zenodo.10617532

StarDist_TumorCell_nuclei#

Gautier Follain, Sujan Ghimire, Joanna Pylvänäinen, Johanna Ivaska, Guillaume Jacquemet

Published 2024-08-29

Licensed CC-BY-4.0

This repository contains a StarDist deep learning model designed for segmenting tumor cell nuclei from the DAPI channel in fluorescence microscopy images while excluding HUVEC nuclei. The model was trained to accurately detect individual tumor cell nuclei for subsequent measurement of CD44, ICAM1, ICAM2, or Fibronectin intensity around or under the nuclei. The model achieved an Intersection over Union (IoU) score of 0.558 and an F1 Score of 0.793, reflecting its capability to distinguish tumor cell nuclei from HUVEC nuclei.

Specifications

Model: StarDist for segmenting tumor cell nuclei from the DAPI fluorescence channel

Training Dataset:

Number of Images: 48 paired fluorescence microscopy images and label masks

Microscope: Spinning disk confocal microscope (3i CSU-W1) with a 20x objective, NA 0.8

Data Type: Fluorescence microscopy images of the DAPI channel with manually segmented masks

File Format: TIFF (.tif)

Fluorescence Images: 16-bit

Masks: 8-bit

Image Size: 920 x 920 pixels (Pixel size: 0.6337 x 0.6337 µm²)

Model Capabilities:

Segment Tumor Cell Nuclei: Detects individual tumor cell nuclei in the DAPI channel while distinguishing them from HUVEC nuclei

Measure Intensity: Enables measurement of CD44, ICAM1, ICAM2, or Fibronectin intensity around or under tumor cell nuclei in respective channels

Performance:

Average IoU: 0.558

Average F1 Score: 0.793

Model Training: Conducted using ZeroCostDL4Mic (HenriquesLab/ZeroCostDL4Mic)

Reference: Fast label-free live imaging reveals key roles of flow dynamics and CD44-HA interaction in cancer cell arrest on endothelial monolayers

Gautier Follain, Sujan Ghimire, Joanna W. Pylvänäinen, Monika Vaitkevičiūtė, Diana Wurzinger, Camilo Guzmán, James RW Conway, Michal Dibus, Sanna Oikari, Kirsi Rilla, Marko Salmi, Johanna Ivaska, Guillaume Jacquemet bioRxiv 2024.09.30.615654; doi: https://doi.org/10.1101/2024.09.30.615654

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/13443221

https://doi.org/10.5281/zenodo.13443221

Stardist model and training dataset for automated tracking of MDA-MB-231 and BT20 cells#

Hussein Al-Akhrass, Johanna Ivaska, Guillaume Jacquemet

Published 2021-05-26

Licensed CC-BY-4.0

StarDist Model: The StarDist model was generated using the ZeroCostDL4Mic platform (Chamier et al., 2021). This custom StarDist model was trained for 300 epochs using 46 manually annotated paired images (image dimensions: (1024, 1024)) with a batch size of 2, an augmentation factor of 4 and a mae loss function. The StarDist “Versatile fluorescent nuclei” model was used as a training starting point. Key Python packages used include TensorFlow (v 0.1.12), Keras (v 2.3.1), CSBdeep (v 0.6.1), NumPy (v 1.19.5), Cuda (v 11.0.221). The training was accelerated using a Tesla P100 GPU. The model weights can be used in the ZeroCostDL4Mic StarDist 2D notebook or in the StarDist Fiji plugin.

StarDist Training dataset: Paired microscopy images (fluorescence) and corresponding masks

Microscopy data type: Fluorescence microscopy (SiR-DNA) and masks obtained via manual segmentation (see HenriquesLab/ZeroCostDL4Mic for details about the segmentation)

Cells were imaged using a 20x Nikon CFI Plan Apo Lambda objective (NA 0.75) one frame every 10 minutes for 16h.

Cell type: MDA-MB-231 cells and BT20 cells

File format: .tif (16-bit for fluorescence and 8 and 16-bit for the masks)

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/4811213

https://doi.org/10.5281/zenodo.4811213

Stardist_MiaPaCa2_from_CD44#

Gautier Follain, Sujan Ghimire, Joanna Pylvänäinen, Johanna Ivaska, Guillaume Jacquemet

Published 2024-08-29

Licensed CC-BY-4.0

This repository contains a StarDist deep learning model designed for segmenting MiaPaCa2 cells from the CD44 channel in fluorescence microscopy images. The model is capable of accurately segmenting individual MiaPaCa2 cells while excluding HUVECs. Trained on a small dataset, the model achieved an Intersection over Union (IoU) score of 0.884 and an F1 Score of 0.950, indicating high precision in cell segmentation.

Specifications

Model: StarDist for segmenting MiaPaCa2 cells from the CD44 fluorescence channel

Training Dataset:

Number of Images: 8 paired fluorescence microscopy images and label masks

Microscope: Spinning disk confocal microscope (3i CSU-W1) with a 20x objective, NA 0.8

Data Type: Fluorescence microscopy images of the CD44 channel, obtained after immunofluorescence staining with primary and secondary antibodies and manually segmented masks

File Format: TIFF (.tif)

Fluorescence Images: 16-bit

Masks: 8-bit

Image Size: 920 x 920 pixels (Pixel size: 0.6337 x 0.6337 µm²)

Model Capabilities:

Segment MiaPaCa2 Cells: Accurately detects individual MiaPaCa2 cells while ignoring HUVECs

Measure CD44 Intensity: Allows for the measurement of CD44 intensity around MiaPaCa2 cells, specifically from the CD44 channel

Performance:

Average IoU: 0.884

Average F1 Score: 0.950

Model Training: Conducted using ZeroCostDL4Mic (HenriquesLab/ZeroCostDL4Mic)

Reference: Fast label-free live imaging reveals key roles of flow dynamics and CD44-HA interaction in cancer cell arrest on endothelial monolayers

Gautier Follain, Sujan Ghimire, Joanna W. Pylvänäinen, Monika Vaitkevičiūtė, Diana Wurzinger, Camilo Guzmán, James RW Conway, Michal Dibus, Sanna Oikari, Kirsi Rilla, Marko Salmi, Johanna Ivaska, Guillaume Jacquemet bioRxiv 2024.09.30.615654; doi: https://doi.org/10.1101/2024.09.30.615654

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/13442877

https://doi.org/10.5281/zenodo.13442877

SynapseNet Training Data#

Constantin Pape

Published 2024-12-01

Licensed CC-BY-4.0

This dataset contains room-temperature single-axis TEM tomograms from Schaffer collateral and mossy fiber synapses in organotypic hippocampal slices. The tomograms were published in the two studies [1, 2]. The data was re-used for training deep neural networks to segment different synaptic structures in electron micrographs in [3].

For the tomograms, organotypic slices were prepared from the hippocampi of neonatal mice according to the interface protocol and vitrified after 28 days in vitro in culture medium supplemented with 20% (w/v) bovine serum albumin using an HPM100 (Leica) high-pressure freezing device. The dataset also contains 23 tomograms resulting from chemically-fixed material, which were also published in (Maus et al., 2020). For these tomograms, wild-type animals at postnatal day 28 were transcardially perfused under deep anesthesia, first with 0.9% sodium chloride solution, and then one of two fixatives (Fixative 1: Ice-cold 4% paraformaldehyde, 2.5% glutaraldehyde in 0.1 M phosphate buffer; Fixative 2: 37 °C 2% paraformaldehyde, 2.5% glutaraldehyde, 2 mM CaCl2, in 0.1 M cacodylate buffer). Brains were rinsed and sectioned coronally through the dorsal hippocampus in an ice-cold 0.1 M phosphate buffer using a VT 1200S vibratome (Leica) (step size 100 µm; amplitude 1.5 mm, speed 0.1 mm/sec). Hippocampal CA3 subregions were excised using a 1.5 mm diameter biopsy punch and high-pressure frozen on the same day in 20% (w/v) bovine serum albumin using an HPM100 (Leica) high-pressure freezing device. For both sample preparations, automated freeze-substitution was performed. Tomograms were collected using a 200 kV JEM-2100 (JEOL) transmission electron microscope equipped with an 11 MP Orius SC1000 CCD camera (Gatan). Tilt-series (tilt range +/- 60°; 1° angular increments) were acquired at 30 000x magnification using SerialEM. Tomographic reconstructions were generated using weighted back-projection with etomo.

The data is organized into two different subfolders for data with annotations for “vesicles” and “active_zones”. Each of these subfolders is further subdivided into “train” and “test” folders, which contain tomograms for the two different sample preparations in “chemical_fixation” and “single_axis_tem”. Each tomogram and the corresponding annotation is stored as an HDF5 file, containing the following internal datasets:

- raw: The tomogram data.
- labels/vesicles: Annotations for the synaptic vesicles, annotated with IMOD, further postprocessed and then exported to instance masks. (for tomograms in “vesicles”)
- labels/AZ: Annotations for the active zone, annotated with IMOD and exported to binary masks.

[1] Imig et al., The Morphological and Molecular Nature of Synaptic Vesicle Priming at Presynaptic Active Zones, Neuron, 2014, DOI: 10.1016/j.neuron.2014.10.009
[2] Maus et al., Ultrastructural Correlates of Presynaptic Functional Heterogeneity in Hippocampal Synapses, Cell Reports, 2020, DOI: 10.1016/j.celrep.2020.02.083
[3] Muth, Moschref et al., 2024, Preprint to be published
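The folder organization described in this entry can be enumerated as follows. This is a sketch under assumptions: the nesting order annotation/split/preparation is read from the description and may not match the archive exactly, the root folder name is hypothetical, and the pairing of labels/AZ with the “active_zones” subfolder is inferred rather than stated.

```python
from itertools import product
from pathlib import PurePosixPath

ANNOTATIONS = ("vesicles", "active_zones")
SPLITS = ("train", "test")
PREPARATIONS = ("chemical_fixation", "single_axis_tem")

# Internal HDF5 dataset names listed in the description.
# Assumption: labels/AZ belongs to files in the "active_zones" subfolder.
DATASET_KEYS = {
    "vesicles": ("raw", "labels/vesicles"),
    "active_zones": ("raw", "labels/AZ"),
}


def expected_folders(root):
    """List every annotation/split/preparation folder under a root path."""
    return [
        PurePosixPath(root, annotation, split, preparation)
        for annotation, split, preparation in product(ANNOTATIONS, SPLITS, PREPARATIONS)
    ]


folders = expected_folders("synapse_net_data")  # hypothetical root folder name
```

With the folder list in hand, each HDF5 file could then be opened with an HDF5 reader and the keys from `DATASET_KEYS` looked up for the matching annotation type.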

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/14330011

https://doi.org/10.5281/zenodo.14330011

Synthetic cells#

Vebjorn Ljosa, Katherine L. Sokolnicki, Anne E. Carpenter

Published 2012-06-28

Licensed CC-BY-NC-SA-3.0

One of the principal challenges in counting or segmenting nuclei is dealing with clustered nuclei. To help assess an algorithm's performance in this regard, this synthetic image set consists of five subsets with an increasing degree of clustering.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://bbbc.broadinstitute.org/BBBC004

Synthetic images and segmentation masks simulating HL-60 cell nucleus in 3D#

David Svoboda, Michal Kozubek, Stanislav Stejskal, Teresa Zulueta-Coarasa

Published 2024-11-26

Licensed CC-BY-3.0

One of the principal challenges in counting or segmenting nuclei is dealing with clustered nuclei. To help assess an algorithm's performance in this regard, this synthetic image set consists of four subsets with an increasing degree of clustering. Each subset is also provided in two different levels of quality: high SNR and low SNR.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.ebi.ac.uk/bioimage-archive/galleries/ai/analysed-dataset/S-BIAD1492/

TNBC#

Peter Jack Naylor, Thomas Walter, Marick Laé, Fabien Reyal

Published 2018-02-16

Licensed CC-BY-4.0

This dataset contains a large number of annotated cells, including normal epithelial and myoepithelial breast cells (localized in ducts and lobules), invasive carcinomatous cells, fibroblasts, endothelial cells, adipocytes, macrophages, and inflammatory cells (lymphocytes and plasmocytes). In total, the dataset consists of 50 images with 4022 annotated cells. The maximum number of cells in one sample is 293 and the minimum is 5, with an average of 80 cells per sample and a high standard deviation of 58. The annotation was performed by three experts: an expert pathologist and two trained research fellows. Each sample was annotated by one of the annotators and checked by another; in case of disagreement, a consensus was reached by discussion among the three experts.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://paperswithcode.com/dataset/tnbc

Training set of microscopy images for Dietler et al. Nature Communications 2020#

Nicola Dietler, Matthias Minder, Vojislav Gligorovski, Augoustina Maria Economou, Denis Alain Henri Lucien Joly, Ahmad Sadeghi, Chun Hei Michael Chan, Mateusz Kozinski, Martin Weigert, Anne-Florence Bitbol, Sahand Jamal Rahi

Published 2021-12-07

Licensed CC-BY-4.0

Training set of microscopy images for Dietler et al. Nature Communications 2020

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/5765648

https://doi.org/10.5281/zenodo.5765648

Volumetric segmentation of biological cells and subcellular structures for optical diffraction tomography images - dataset#

Martyna Mazur, Wojciech Krauze

Published 2023-06-16

Licensed CC-BY-4.0

This dataset includes 4 files with segmentation results for 4 different ODT reconstructions of SH-SY5Y neuroblastoma cells. The segmentation results contain:

3D binary masks of biological cells obtained through Cellpose [1] and ODT-SAS;
3D binary masks of organelles: nucleoli and lipid structures (LS) obtained through slice-by-slice manual segmentation and ODT-SAS.

All files are *.mat files.

The files REC_SH-SY5Y_1.mat, REC_SH-SY5Y_2.mat and REC_SH-SY5Y_3.mat each contain 7 variables:

RECON – tomographic reconstruction of the SH-SY5Y neuroblastoma cell;
n_imm – refractive index of the object immersion medium;
dx – object space sample size in XY [µm];
rayXY – xy-coordinates of the illumination vectors;
maskManual – table with manually determined 3D binary masks of organelles;
maskCellpose – 3D binary mask of the biological cell obtained through Cellpose;
maskODTSAS – table with 3D binary masks of the biological cell and its organelles obtained through ODT-SAS.

File REC_SH-SY5Y_4.mat includes masks for the ODT-SAS and Cellpose segmentation of three closely packed cells and consists of 5 variables: RECON, n_imm, dx, maskCellpose and maskODTSAS.

To access a particular 3D binary mask in the ‘maskManual’ and ‘maskODTSAS’ tables, use one of the following names: ‘Cell’, ‘Nucleoli’, ‘LS’. For example:

cellMask = maskODTSAS.Cell{1};

[1] Stringer, C., Wang, T., Michaelos, M., & Pachitariu, M. (2021). Cellpose: a generalist algorithm for cellular segmentation. Nature Methods, 18(1), 100-106.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/8188948

https://doi.org/10.5281/zenodo.8188948

ZeroCostDL4Mic - Stardist 2D example training and test dataset (light)#

Johanna Jukkala, Guillaume Jacquemet

Published 2023-05-19

Licensed CC-BY-4.0

Name: ZeroCostDL4Mic - Stardist 2D example training and test dataset (light)

(see our Wiki for details)

Data type: Paired microscopy images (fluorescence) and corresponding masks

Microscopy data type: Fluorescence microscopy (SiR-DNA) and masks obtained via manual segmentation (see https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki/Stardist for details about the segmentation)

Microscope: Spinning disk confocal microscope with a 20x 0.8 NA objective

Cell type: DCIS.COM LifeAct-RFP cells

File format: .tif (16-bit for the fluorescence images; 8- and 16-bit for the masks)

Image size: 1024x1024 (Pixel size: 634 nm)

Author(s): Johanna Jukkala (1, 2) and Guillaume Jacquemet (1, 2)

Contact email: guillaume.jacquemet@abo.fi

Affiliations:

(1) Faculty of Science and Engineering, Cell Biology, Åbo Akademi University, 20520 Turku, Finland
(2) Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520 Turku, Finland

Funding bodies: G.J. was supported by grants awarded by the Academy of Finland, the Sigrid Juselius Foundation and Åbo Akademi University Research Foundation (CoE CellMech) and by Drug Discovery and Diagnostics strategic funding to Åbo Akademi University.
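The image size and pixel size listed above together determine the physical field of view of each image. The short calculation below is an illustrative sketch only; the function name is hypothetical, and it assumes square pixels with the stated 634 nm edge length.

```python
# Values taken from the dataset description above.
IMAGE_SIZE_PX = (1024, 1024)  # width, height in pixels
PIXEL_SIZE_NM = 634           # edge length of one pixel (assumed square)


def field_of_view_um(size_px, pixel_size_nm):
    """Return the physical (width, height) of an image in micrometres."""
    return tuple(n * pixel_size_nm / 1000 for n in size_px)


fov = field_of_view_um(IMAGE_SIZE_PX, PIXEL_SIZE_NM)
print(f"Field of view: {fov[0]:.1f} x {fov[1]:.1f} µm")
# prints "Field of view: 649.2 x 649.2 µm"
```

Under these assumptions each image covers roughly 0.65 mm on a side.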

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/7949940

https://doi.org/10.5281/zenodo.7949940

ZeroCostDL4Mic - Stardist example training and test dataset#

Johanna Jukkala, Guillaume Jacquemet

Published 2020-03-17

Licensed CC-BY-4.0

Name: ZeroCostDL4Mic - Stardist example training and test dataset

(see our Wiki for details)

Data type: Paired microscopy images (fluorescence) and corresponding masks

Microscopy data type: Fluorescence microscopy (SiR-DNA) and masks obtained via manual segmentation (see HenriquesLab/ZeroCostDL4Mic for details about the segmentation)

Microscope: Spinning disk confocal microscope with a 20x 0.8 NA objective

Cell type: DCIS.COM LifeAct-RFP cells

File format: .tif (16-bit for the fluorescence images; 8- and 16-bit for the masks)

Image size: 1024x1024 (Pixel size: 634 nm)

Author(s): Johanna Jukkala (1, 2) and Guillaume Jacquemet (1, 2)

Contact email: guillaume.jacquemet@abo.fi

Affiliations:

(1) Faculty of Science and Engineering, Cell Biology, Åbo Akademi University, 20520 Turku, Finland
(2) Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520 Turku, Finland

Associated publications: Unpublished

Funding bodies: G.J. was supported by grants awarded by the Academy of Finland, the Sigrid Juselius Foundation and Åbo Akademi University Research Foundation (CoE CellMech) and by Drug Discovery and Diagnostics strategic funding to Åbo Akademi University.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://zenodo.org/records/3715492

https://doi.org/10.5281/zenodo.3715492

cellpose training data#

Carsen Stringer, Tim Wang, Michalis Michaelos, Marius Pachitariu

Published 2020-12-14

Licensed CUSTOM LICENSE

This is a cellpose training dataset. Cellpose is a generalist deep learning model for cell segmentation.

Tags: Ai-Ready, Exclude From Dalia

Content type: Data

https://www.cellpose.org/dataset
