CAVE: Connectome Annotation Versioning Engine

CAVE: Connectome Annotation Versioning Engine

Dorkenwald S, Schneider-Mizell CM, Brittain D, Halageri A, Jordan C, Kemnitz N, Castro MA, Silversmith W, Maitin-Shephard J, Troidl J, Pfister H, Gillet V, Xenes D, Bae JA, Bodor AL, Buchanan J, Bumbarger DJ, Elabbady L, Jia Z, Kapner D, Kinn S, Lee K, Li K, Lu R, Macrina T, Mahalingam G, Mitchell E, Mondal SS, Mu S, Nehoran B, Popovych S, Takeno M, Torres R, Turner NL, Wong W, Wu J, Yin W, Yu S, Reid RC, da Costa NM, Seung HS, and Collman F.

Nature Methods, 2025.

Advances in electron microscopy, image segmentation, and computational infrastructure have given rise to large-scale and richly annotated connectomic datasets, which are increasingly shared across communities. To enable collaboration, users need to be able to concurrently create annotations and correct errors in the automated segmentation by proofreading. In large datasets, every proofreading edit relabels cell identities of millions of voxels and thousands of annotations like synapses. For analysis, users require immediate and reproducible access to this changing and expanding data landscape. Here, we present the Connectome Annotation Versioning Engine (CAVE), a computational infrastructure that provides scalable solutions for proofreading and flexible annotation support for fast analysis queries at arbitrary time points. Deployed as a suite of web services, CAVE empowers distributed communities to perform reproducible connectome analysis in up to petascale datasets ({textasciitilde}1 mm3) while proofreading and annotating is ongoing.

Acknowledgements

We thank John Wiggins, G. McGrath, and Dave Barlieb for computer system administration and M. Husseini for project administration. We thank Gregory Jefferis, Davi Bock, Eric Perlman, Philipp Schlegel, and Stephan Gerhard for providing feedback on the system design. We thank Gregory Jeffries, Phillip Schlegel, Brock Wester, Stephan Gerhard, Arie Matsliah, Brendan Celli, and Jake Reimer for building tools that leverage the CAVE infrastructure and providing consistent feedback on its performance and implementation. We thank Pedro Nunez Gomez for providing advice about deployment strategies. We would like to thank John Tuthill, Wei-Chung Allen Lee and the FANC community for their collaboration. We would like to thank Viren Jain, and Jeff Lichtman for collaboration on the H01 dataset. We would like to thank Stanley Heinze, Kevin Teodore for collaboration on their datasets. We would like to thank Zetta.ai for collaborations that use CAVE for further datasets. We would like to thank the FlyWire Consortium for collaboration on the FlyWire dataset. We thank the Allen Institute for Brain Science founder, P. G. Allen, for his vision, encouragement and support. Sebastian Seung acknowledges support from the National Institutes of Health (NIH) BRAIN Initiative RF1 MH129268, U24 NS126935, and RF1 MH123400, as well as assistance from Google. Forrest Collman, Nuno da Costa and Clay Reid acknowledge support from NIH RF1MH125932 and from NSF NeuroNex 2 award 2014862. Jakob Troidl and Hanspeter Pfister were supported by NSF grant NCS-FO-2124179. This work was supported by the Intelligence Advanced Research Projects Activity via Department of Interior/Interior Business Center contract numbers D16PC00004, D16PC0005, 2017-17032700004-005 and 2020-20081800401-023. The US Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Intelligence Advanced Research Projects Activity, ODNI, Department of Interior/Interior Business Center or the US Government.