Supplementary materials

SPIN: A Method of Skeleton-based Polarity Identification for Neurons

Yi-Hsuan Lee Yen-Nan Lin Chung-Chuan Lo

Chung-Chuan Lo()Yi-Hsuan Lee  Yen-Nan Lin

Institute of Systems Neuroscience, National Tsing Hua University, Hsinchu 30013, Taiwan;

e-mail:

Tel: +886-3-574-2014, +886-3-571-5131 ext. 80390

Fax: +886-3-571-5934

Chung-Chuan Lo

Brain Research Center, National Tsing Hua University, Hsinchu 30013, Taiwan

Contents

A. Terminology

B. Lists of neurons used in the study

C. Definition of morphological features

C.1 Length-related features

C.2 Branch-related features

C.3 Volume-related features

D. Parameter definition and default values

E. Program tutorial

E.1 Quick guide

E.2 Operation instructions

E.2.1 The very first step

E.2.2 Manual data preprocessing

E.2.3 Classifier training

E.2.4 Polarity identification

F. References

A. Terminology

The terms used in the present paper follow those used in the TREES toolbox (Cuntz, Forstner, Borst, & Häusser 2010). A neuron's structure can be represented by a set of interconnected nodes. There are three types of nodes: continuation points, branch points, and terminal points. Two connected nodes form a segment. A branch may be composed of more than one segment and is delimited by either two branch points or a branch point and a terminal point.
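The node types above can be illustrated with a minimal sketch. It assumes the skeleton is stored as a mapping from node ID to parent ID, with -1 marking the root (as in SWC files). The function name `classify_nodes` and the separate "root" label are illustrative assumptions and are not part of SPIN or the TREES toolbox.

```python
from collections import Counter

def classify_nodes(parent):
    """Label each node by its number of children.

    parent: dict mapping node ID -> parent ID (-1 marks the root).
    Returns a dict node ID -> 'root', 'terminal', 'continuation', or 'branch'.
    """
    # Count how many children each node has.
    n_children = Counter(p for p in parent.values() if p != -1)
    labels = {}
    for node, par in parent.items():
        k = n_children[node]
        if par == -1:
            labels[node] = "root"          # labeled separately (assumption)
        elif k == 0:
            labels[node] = "terminal"      # no children
        elif k == 1:
            labels[node] = "continuation"  # exactly one child
        else:
            labels[node] = "branch"        # two or more children
    return labels

# Toy skeleton: 1 (root) - 2 - 3, where node 3 splits into 4 and 5.
toy = {1: -1, 2: 1, 3: 2, 4: 3, 5: 3}
print(classify_nodes(toy))
```

In this toy skeleton, node 2 is a continuation point, node 3 is a branch point, and nodes 4 and 5 are terminal points; the branches are 1-2-3, 3-4, and 3-5.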

B. Lists of neurons used in the study

PB neurons listed by their IDs in the FlyCircuit database (Chiang et al. 2011). For the sake of simplicity, in this paper we refer to specific neurons by their numbers (first column).

Number / Training neuron ID / Test neuron ID
1 / Cha-F-400006 / Gad1-F-300123
2 / Cha-F-400012 / Gad1-F-500035
3 / Cha-F-500009 / Cha-F-500028
4 / Cha-F-200009 / Cha-F-000098
5 / Cha-F-400017 / Cha-F-100041
6 / Cha-F-200013 / Gad1-F-300027
7 / Cha-F-000014 / Cha-F-200084
8 / Cha-F-000023 / Cha-F-000050
9 / Cha-F-500046 / Gad1-F-300066
10 / Cha-F-200046 / Gad1-F-500065
11 / Cha-F-200068 / Gad1-F-300029
12 / Cha-F-700086 / Gad1-F-600081
13 / Cha-F-500109 / Cha-F-100032
14 / Cha-F-100065 / Cha-F-300072
15 / Cha-F-300152 / Gad1-F-800013
16 / Cha-F-000106 / Cha-F-600001
17 / VGlut-F-300517 / Gad1-F-100004
18 / Gad1-F-400017 / Cha-F-300160
19 / Gad1-F-600003 / Gad1-F-800025
20 / Gad1-F-900011 / Cha-F-000031
21 / Gad1-F-600025
22 / Gad1-F-600033
23 / Gad1-F-800046
24 / Gad1-F-600077
25 / Gad1-F-600084
26 / TH-F-000048
27 / Tdc2-F-300003
28 / Tdc2-F-500000
29 / Tdc2-F-600000
30 / Tdc2-F-400002

MED neurons listed by their IDs in the FlyCircuit database (Chiang et al. 2011). For the sake of simplicity, in this paper we refer to specific neurons by their numbers (first column).

Number / Training neuron ID / Test neuron ID
1 / 5-HT1B-F-500013 / fru-F-300050
2 / Cha-F-300010 / VGlut-F-500012
3 / Cha-F-100027 / VGlut-F-400884
4 / Cha-F-400101 / VGlut-F-400671
5 / Cha-F-700121 / VGlut-F-300600
6 / Cha-F-100052 / VGlut-F-900011
7 / VGlut-F-200401 / VGlut-F-000130
8 / VGlut-F-400521 / Cha-F-500093
9 / VGlut-F-400577 / Tdc2-F-000022
10 / VGlut-F-800081 / fru-F-800011
11 / VGlut-F-700226[*] / VGlut-F-000188
12 / VGlut-F-800100 / Cha-F-500044
13 / VGlut-F-300494 / VGlut-F-300391
14 / VGlut-F-100277 / VGlut-F-300560
15 / VGlut-F-900093[*] / fru-F-700075
16 / VGlut-F-000557 / VGlut-F-400142
17 / VGlut-F-000600 / VGlut-F-400360
18 / VGlut-F-300602 / fru-F-800015
19 / VGlut-F-200012 / Tdc2-F-100067
20 / VGlut-F-400013 / Trh-F-300093
21 / VGlut-F-300212
22 / VGlut-F-800017
23 / VGlut-F-300037
24 / VGlut-F-200114
25 / VGlut-F-300103
26 / VGlut-F-400133
27 / fru-F-300054
28 / fru-F-000053
29 / Gad1-F-400107
30 / Trh-F-400043
31 / Trh-F-300113
32 / Tdc2-F-100013
33 / Tdc2-F-100037
34 / Tdc2-F-200049
35 / Tdc2-F-200058
36 / Tdc2-F-200065
37 / Tdc2-F-200066

C. Definition of morphological features

All morphological features are defined relative to a substructure. The definitions can be extended to the complete neural skeleton by replacing all occurrences of the term "root of the substructure" with simply "soma."

C.1 Length-related features

1. Summation of segment lengths

2. Maximum path length

4. Mean ratio of path length to Euclidean length

7. Mean branch length

8. Mean path length

18. Balancing factor

19. Path length to soma

•Path length: For a node within the substructure, the path length is the summation of segment lengths between the node and the root of the substructure. SPIN calculates path lengths for all branch points and terminal points within a substructure.

•Euclidean length: For a node within the substructure, the Euclidean length is the Euclidean distance between the node and the root of the substructure. SPIN calculates Euclidean lengths for all branch points and terminal points within a substructure.

•Branch length: A branch length is the summation of segment lengths for a branch, i.e., the summation of segment lengths between two branch points or between one branch point and one terminal point. SPIN calculates branch lengths for all possible branches within a substructure.

•Path length to soma: The path length to soma for a substructure is the summation of segment lengths between the root of the substructure and the soma. Note that this feature is normalized by the longest possible path length in the neuron.

•Balancing factor: Cuntz et al. (2010) proposed the idea of a balancing factor as a weighting between the material cost and conduction time during the construction of neuronal branches. The authors proposed that the construction of neuronal branches should minimize a total cost given by:

$$\text{total cost} = \text{wiring cost} + bf \cdot \text{path length cost},$$

where the wiring cost is given by the Euclidean distance between the carrier points (unconnected points) and the node on the tree, the path length cost is given by the path length between the carrier points and the node on the tree, and $bf$ is the balancing factor.

However, the authors used the equation only to construct a tree structure with an assumed balancing factor; no method was proposed to estimate the balancing factor for a given neuronal structure. To extract the balancing factor as a morphological feature, SPIN adopts the following procedure: A series of tree structures is constructed based on the nodes of a target neuron, with assumed balancing factors ranging from 0 to 1 at intervals of 0.1. Next, the constructed structure that most resembles the actual skeleton of the target neuron is selected. To find the best fit, we evaluate the similarity between two structures (the actual and the constructed) by calculating an error score defined as:

Then SPIN uses the Nelder-Mead simplex algorithm (Lagarias, Reeds, Wright, & Wright 1998) to find the balancing factor that minimizes the error score.
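The path length and Euclidean length definitions in this section can be sketched as follows. This is an illustration rather than SPIN's implementation: it assumes node coordinates and parent links are held in plain dictionaries, and the function names `path_lengths` and `mean_path_to_euclidean_ratio` are hypothetical.

```python
import math

def path_lengths(coords, parent, root):
    """Path length from `root` to every node (sum of segment lengths).

    coords: dict node ID -> (x, y, z)
    parent: dict node ID -> parent ID for every node except `root`
    """
    path = {root: 0.0}
    for node in coords:
        # Climb toward the root until reaching a node whose path length is known.
        chain, cur = [], node
        while cur not in path:
            chain.append(cur)
            cur = parent[cur]
        # Fill in the path lengths on the way back down.
        for n in reversed(chain):
            p = parent[n]
            path[n] = path[p] + math.dist(coords[n], coords[p])
    return path

def mean_path_to_euclidean_ratio(coords, parent, root, nodes):
    """Mean of path length / Euclidean length over `nodes`.

    `nodes` should hold the branch and terminal points of the substructure
    (the root itself must be excluded, since its Euclidean length is zero).
    """
    path = path_lengths(coords, parent, root)
    ratios = [path[n] / math.dist(coords[n], coords[root]) for n in nodes]
    return sum(ratios) / len(ratios)

# Toy substructure: an L-shaped path 1 -> 2 -> 3 with segments of length 3 and 4.
coords = {1: (0, 0, 0), 2: (3, 0, 0), 3: (3, 4, 0)}
parent = {2: 1, 3: 2}
print(path_lengths(coords, parent, root=1)[3])              # path length of node 3
print(mean_path_to_euclidean_ratio(coords, parent, 1, [3]))  # path / Euclidean
```

For the terminal node 3 the path length is 7 while the Euclidean length is 5, so a ratio above 1 indicates how tortuous the route from the root is.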

C.2 Branch-related features

3. Number of branch points

5. Maximum branch order

6. Mean branch angle

9. Mean branch order

16. Mean asymmetry at branch points

20. Branch order in a complete neuron

•Branch order: For a node within a substructure, the branch order is the number of branch points along the path between the node and the root of the substructure. SPIN calculates branch orders for all branch points and terminal points within a substructure.

•Branch angle: For a branch point within a substructure, the branch angle is the angle between the two descending segments of a branch point. SPIN calculates branch angles for all branch points within a substructure.

•Branch order in a complete neuron: The branch order in a complete neuron for a substructure is the number of branch points along the path between the root of the substructure and the soma. Note that this feature is normalized by the largest possible branch order in the neuron.

•Branch point asymmetry: The asymmetry at a branch point is the asymmetry in the numbers of descending terminal points arising from the two directly descending nodes. Assuming the number of descending terminal points is $n_1$ for one directly descending node and $n_2$ for the other directly descending node, the branch point asymmetry is defined as $n_1/(n_1+n_2)$ (assuming $n_1 \le n_2$). SPIN calculates asymmetry values for all branch points within a substructure.
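A minimal sketch of the branch order and branch point asymmetry definitions is given below. Two assumptions are made that the text does not settle: the branch order of a node counts the branch points strictly above it (so the root has order 0), and the asymmetry places the smaller terminal count in the numerator. All function names are hypothetical.

```python
def build_children(parent):
    """parent: dict node ID -> parent ID (-1 for the root)."""
    children = {n: [] for n in parent}
    for n, p in parent.items():
        if p in children:
            children[p].append(n)
    return children

def branch_orders(parent, root):
    """Number of branch points on the path from the root down to each node."""
    children = build_children(parent)
    order = {root: 0}
    stack = [root]
    while stack:
        node = stack.pop()
        # Passing through a branch point raises the order of its descendants.
        step = 1 if len(children[node]) > 1 else 0
        for c in children[node]:
            order[c] = order[node] + step
            stack.append(c)
    return order

def terminal_counts(parent, root):
    """Number of terminal points descending from each node."""
    children = build_children(parent)
    visit, stack = [], [root]
    while stack:                      # pre-order traversal
        node = stack.pop()
        visit.append(node)
        stack.extend(children[node])
    count = {}
    for node in reversed(visit):      # children are processed before parents
        kids = children[node]
        count[node] = 1 if not kids else sum(count[c] for c in kids)
    return count

def branch_point_asymmetry(parent, root):
    """Asymmetry smaller / (smaller + larger) at every two-way branch point."""
    children = build_children(parent)
    count = terminal_counts(parent, root)
    result = {}
    for node, kids in children.items():
        if len(kids) == 2:
            n1, n2 = sorted(count[c] for c in kids)
            result[node] = n1 / (n1 + n2)
    return result

# Toy tree: 1 -> (2, 3), 3 -> (4, 5), 5 -> (6, 7)
parent = {1: -1, 2: 1, 3: 1, 4: 3, 5: 3, 6: 5, 7: 5}
print(branch_orders(parent, 1))
print(branch_point_asymmetry(parent, 1))
```

Under these assumptions a perfectly balanced branch point has asymmetry 0.5, and the value approaches 0 as one side dominates.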

C.3 Volume-related features

10. Ratio of width (x direction) to height (y direction) of the substructure

11. Ratio of width (x direction) to depth (z direction) of the substructure

12. Center of mass of the substructure in the x direction

13. Center of mass of the substructure in the y direction

14. Center of mass of the substructure in the z direction

15. Volume of the convex hull

17. Mean volume of Voronoi pieces

•Convex hull: The convex hull of a substructure is the smallest convex set containing all nodes of the substructure.

•Voronoi pieces: The Voronoi algorithm subdivides the convex hull enclosing a substructure into regions, called Voronoi pieces, each containing exactly one node.
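Features 10 to 14 can be sketched with the standard library alone. The sketch assumes that width, height, and depth are the extents of the axis-aligned bounding box of the nodes, and that the center of mass is the unweighted mean of the node coordinates; the text does not state either choice, and the function name `extent_ratios_and_center` is hypothetical.

```python
def extent_ratios_and_center(points):
    """Return (width/height, width/depth, center of mass) for a list of (x, y, z).

    Assumes the substructure is not flat along y or z (non-zero height and depth).
    """
    xs, ys, zs = zip(*points)
    width = max(xs) - min(xs)    # extent along x
    height = max(ys) - min(ys)   # extent along y
    depth = max(zs) - min(zs)    # extent along z
    n = len(points)
    center = (sum(xs) / n, sum(ys) / n, sum(zs) / n)
    return width / height, width / depth, center

nodes = [(0, 0, 0), (4, 0, 0), (0, 2, 0), (0, 0, 1)]
print(extent_ratios_and_center(nodes))
```

The convex hull volume (feature 15) needs a computational-geometry routine; in Python, `scipy.spatial.ConvexHull(points).volume` is one option.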

D. Parameter definition and default values

Sub-step / Processing / Parameter name / PB / MED / Blowfly / Explanation
Artificial branch removal / Trunk isolation / / 3 / 3 / 3 / Number of cleaning iterations. To locate the neuron trunk, terminal branches that are shorter than the longest branch length are removed. This process is repeated for the specified number of iterations.
/ / / 0.22 / 0.37 / 0.37 / The maximum relative branch length to be removed completely, or the percentage of branch lengths to remove (see text for details).
Dividing point scan / Undivided branch scan / / 0.10 / 0.12 / 0.15 / The minimum length of an undivided branch.
/ Critical point scan (splits from trunks) / / 0.35 / 0.35 / 0.35 / The threshold for defining critical points (see text for details).
/ / / 0.85 / 0.85 / 0.85 / The minimum percentage of descending terminal points following a critical point.
/ Dividing point determination / / 0.01 / 0.08 / 0.001 / The minimum number of descending terminal points exclusively following a real dividing point.

E. Program tutorial

E.1 Quick guide

The SPIN software package comes with a sample classifier and a set of sample neurons for testing. Here we guide users through applying the enclosed classifier and neuron data to polarity identification, the last stage of the SPIN system:

Step 1: Setting up a Matlab search path

1) Execute goInclude.m. This script automatically adds the required functions to the Matlab search path.

Step 2: Performing neuron polarity identification

1) Execute goIdentifyPolarity_GUI.m. The script opens a GUI (graphical user interface) of SPIN for polarity identification.

2) Follow steps a-f (marked on the GUI figure):

  a. Enter the classifier name (here, MED) and select the type (e.g., Exhaustive) of the desired classifier.
  b. Give an arbitrary name to your test data (here we use “test”).
  c. Click on the “Browse” buttons to select a file for the list of neuron names (here, ‘./fileList.txt’) and a directory (here, ‘./SWC_labeled’) that contains the SWC files of your data.
  d. If your SWC files contain known neuronal polarity information (which usually comes from experimental results), select “Compare with experimental results” to export terminal-level accuracies.
  e. You can use the default parameters, which are the PB parameters for neurons with fewer than 50 terminal points and the MED parameters for neurons with more than 50 terminal points.
  f. Click on “GO” to begin polarity identification.
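The default-parameter rule in the steps above can be expressed as a small lookup. The values are copied from the PB and MED columns of the parameter table in Section D, but the key names are hypothetical, because the original parameter symbols are not reproduced in this document. Assigning a neuron with exactly 50 terminal points to the MED set is also an assumption, since the text does not cover that case.

```python
# Default values from the PB and MED columns of the Section D table.
# Key names are illustrative placeholders, not SPIN's parameter names.
DEFAULTS = {
    "PB": {
        "cleaning_iterations": 3,
        "max_relative_branch_length": 0.22,
        "min_undivided_branch_length": 0.10,
        "critical_point_threshold": 0.35,
        "min_terminal_fraction_after_critical_point": 0.85,
        "min_terminals_after_dividing_point": 0.01,
    },
    "MED": {
        "cleaning_iterations": 3,
        "max_relative_branch_length": 0.37,
        "min_undivided_branch_length": 0.12,
        "critical_point_threshold": 0.35,
        "min_terminal_fraction_after_critical_point": 0.85,
        "min_terminals_after_dividing_point": 0.08,
    },
}

def default_parameters(n_terminal_points, threshold=50):
    """Pick the PB set for small neurons and the MED set otherwise.

    A neuron with exactly `threshold` terminal points gets the MED set
    (an assumption; the tutorial leaves this case unspecified).
    """
    name = "PB" if n_terminal_points < threshold else "MED"
    return name, DEFAULTS[name]

print(default_parameters(30)[0])   # small neuron
print(default_parameters(120)[0])  # large neuron
```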

Upon completion, a message box will appear (Fig. E.1.2).

Identified results will be displayed on the panel (Fig. E.1.3).

Users can also locate the results under the directory ‘./Result/test’. These are:

classifiedResultMat: The .mat file of each neuron (including their substructures).

classifiedResultPlot: Resulting plots of each neuron.

classifiedResultSWC: Resulting SWC files.

cleanedTreePlot: Resulting plot after artificial branch removal.

morphoClustPlot: Resulting plot after morphological clustering.

resultRecord_test.txt: Table that summarizes the results. The names of neurons are displayed in the first column. If “Compare with experimental results” is selected, the terminal-level accuracies are displayed in the second column. The following column displays whether warnings were issued for the corresponding substructures (1: type I warning; 2: type II warning; 3: type I + type II warnings). For example, for neuron “VGlut-F-900011” the terminal-level accuracy was around 72%, and the neuronal skeleton was divided into three substructures, with warnings issued for each of them: type II, type I, and type II, respectively.

logFile.txt: A text file that displays error messages (if any).

metaData_test.txt: A text file that records the date, classifier source, classifier type, SWC file source, and parameters.
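The warning codes reported in resultRecord_test.txt (1, 2, and 3) behave like two flags that can be combined. The helper below decodes them. The function name `decode_warning` is hypothetical, and treating 0 as "no warning" is an assumption; the layout of the result file itself is not specified here, so no file parsing is attempted.

```python
def decode_warning(code):
    """Translate a warning code into a list of warning types.

    1 -> type I, 2 -> type II, 3 -> both; 0 is assumed to mean no warning.
    """
    warnings = []
    if code & 1:
        warnings.append("type I")
    if code & 2:
        warnings.append("type II")
    return warnings

# Codes matching the VGlut-F-900011 example: type II, type I, type II.
for substructure, code in enumerate([2, 1, 2], start=1):
    print(substructure, decode_warning(code))
```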

E.2 Operation instructions

In this set of instructions, we show how to use the entire SPIN system, starting with training data. We guide users through all three stages of SPIN: manual data preprocessing, classifier training, and polarity identification.

E.2.1 The very first step

1) Execute goInclude.m. This script automatically adds the required functions to the Matlab search path.

2) Make sure that the main soma is the first node in the SWC file. If not, set the main soma as the root by using the TREES toolbox function redirect_tree.
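Whether a file already satisfies the soma-first requirement can be checked before loading it into SPIN. The sketch below relies on the standard SWC layout (columns: ID, type, x, y, z, radius, parent; comment lines start with `#`) and assumes that the soma carries type code 1 and that the root has parent -1. Files that label the soma differently would need a different test. The function name is hypothetical.

```python
def first_node_is_soma_root(swc_text):
    """Return True if the first data line is a soma (type 1) with parent -1."""
    for line in swc_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        fields = line.split()
        node_type, parent = int(fields[1]), int(fields[6])
        return node_type == 1 and parent == -1
    return False                          # no data lines found

good = "# sample\n1 1 0 0 0 5 -1\n2 3 1 0 0 1 1\n"
bad = "1 3 0 0 0 1 2\n2 1 1 0 0 5 -1\n"
print(first_node_is_soma_root(good), first_node_is_soma_root(bad))
```

Files that fail the check are the ones that need to be re-rooted as described in step 2.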

E.2.2 Manual data preprocessing

Artificial branch removal

Step 1: Data preparation

1) Place the neuronal skeleton data files (in SWC format) under the directory ‘swc_rawdata’.

2) Create a text file containing the list of names of the training neurons. Name the file ‘fileList.txt’ and place it under the SPIN directory. The name of each training neuron should be identical to its SWC file name (without the file extension).

Note: You can change the paths and the file names directly by editing the values of three variables, data.tarDir (line 41), data.srcDir (line 42), and data.nameList (line 46), in the file ‘./data_preprocessing/GUI_ManualDenoise.m’. Be sure to put a ‘/’ at the end of each path.
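The neuron list can be generated from the contents of the raw-data directory instead of being typed by hand. This sketch assumes that fileList.txt holds one neuron name per line, which the text does not state explicitly, and that the files use the lowercase extension `.swc`. The function name `write_file_list` is hypothetical.

```python
from pathlib import Path

def write_file_list(swc_dir, list_path):
    """Write the names of all .swc files in `swc_dir` (without extension).

    One name per line (assumed layout). Returns the list of names written.
    """
    names = sorted(p.stem for p in Path(swc_dir).glob("*.swc"))
    Path(list_path).write_text("\n".join(names) + "\n")
    return names
```

With the default directory layout described above, a call such as `write_file_list("swc_rawdata", "fileList.txt")` from the SPIN directory would produce the list.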

Step 2: Removing artificial branches manually

1) Execute goManualDenoise.m and two panels will show up (Figs. E.2.1 and E.2.2).

2) Select a piece of trunk that you would like to clean by first clicking on the starting point of the trunk (Fig. E.2.3) and then clicking on the end point of the trunk (Fig. E.2.4). The selected trunk will turn red.

3) Click on “Clean” to remove all terminal branches on the selected trunk (Fig. E.2.5).

4) Repeat the process until all artificial branches are removed.

5) Click on “Export” to export a cleaned neuron skeleton (with the default settings, the files can be found under ‘./SWC_cleaned’). If you want to skip this step, simply click “Export all” to copy all the files in the swc_rawdata directory to the SWC_cleaned directory for the next step.

Note: Be sure to deselect the figure tool before clicking on the skeleton.

Morphological clustering & polarity labeling

Note: Here SPIN automatically reads the SWC files (under ‘./SWC_cleaned’) and the list of neuron names (‘./fileList.txt’) from the previous step.

Step 1: Neuronal polarity labeling

1) Execute goHandLabel.m and two panels will show up (Figs. E.2.6 and E.2.7).

2) First click on a branch point in the display panel and then click on a structure type button (axon or dendrite) in the control panel (Fig. E.2.6) to label the polarity. The color of the substructure descending from the branch point will change to indicate the polarity (Fig. E.2.8).

3) Click the “Export” button on the control panel to export the labeled polarity in SWC format. The files will be stored under the directory ‘./SWC_labeled’.

Note: Be sure to deselect the figure tool before clicking on the skeleton.

E.2.3 Classifier training

Step 1: Performing classifier training

1) Execute goTrainClassifier_GUI.m. One panel will show up.

2) Follow steps a-g (marked on the GUI figure):

  a. Give an arbitrary name to the classifier (here, MED).
  b. Click on “Browse” to select the directory that contains the SWC files of the training neurons (here, ‘./SWC_labeled’).
  c. Decide whether to perform feature extraction or not.
  d. Enter the IDs of the features you want to exclude from feature extraction (e.g., to exclude features #1, #3, #4, #5, and #7, enter 1,3:5,7). Please refer to the main text for the list of features.
  e. Select the algorithm(s) for feature selection[†].
  f. Enter the value of k for the k-nearest-neighbor classifier.
  g. Click on “GO” to begin classifier training.

After the training is done, the trained classifier is stored in ‘./Classifier/MED’.
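The feature-exclusion field accepts a comma-separated list in which `a:b` denotes an inclusive range, as in the example 1,3:5,7. How such a string expands into individual feature IDs can be sketched as follows; this is an illustrative parser with a hypothetical name, not the code used by the SPIN GUI.

```python
def parse_feature_ids(spec):
    """Expand a string such as '1,3:5,7' into a sorted list of feature IDs."""
    ids = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue                       # tolerate stray commas
        if ":" in token:
            start, stop = (int(t) for t in token.split(":"))
            ids.update(range(start, stop + 1))   # ranges are inclusive
        else:
            ids.add(int(token))
    return sorted(ids)

print(parse_feature_ids("1,3:5,7"))
```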

E.2.4 Polarity identification

In this stage of SPIN, users use the classifier trained in the previous stage to perform polarity identification for neurons with unknown polarity. Please see the Quick Guide for details.

F. References

Chiang, A.-S., Lin, C.-Y., Chuang, C.-C., Chang, H.-M., Hsieh, C.-H., Yeh, C.-W., … Hwang, J.-K. (2011). Three-Dimensional Reconstruction of Brain-wide Wiring Networks in Drosophila at Single-Cell Resolution. Current Biology, 21(1), 1–11. doi:10.1016/j.cub.2010.11.056

Cuntz, H., Forstner, F., Borst, A., & Häusser, M. (2010). One Rule to Grow Them All: A General Theory of Neuronal Branching and Its Practical Application. PLoS Comput Biol, 6(8), e1000877. doi:10.1371/journal.pcbi.1000877

Lagarias, J. C., Reeds, J. A., Wright, M. H., & Wright, P. E. (1998). Convergence Properties of the Nelder-Mead Simplex Method in Low Dimensions. SIAM Journal on Optimization, 9(1), 112–147.


[*] These neurons were not included in Vaa3D analysis. See Methods in the main text.

[†] The algorithms for feature selection:

Sequential: Use the k-nearest-neighbor classifier and the leave-one-out test to find the feature that correlates most strongly with the polarity. Next, find a second feature that, in combination with the first feature, most improves the correlation. Repeat this procedure until the correlation can no longer be improved.

Exhaustive: For every possible subset of features, use the k-nearest-neighbor classifier and the leave-one-out test to evaluate the correlation between the feature set and the polarity.
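The two selection strategies can be sketched in a few lines. As an assumption, the sketch scores a feature set by the leave-one-out accuracy of a k-nearest-neighbor classifier with Euclidean distance; the score SPIN actually uses (described above as a correlation with the polarity) may differ, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def loo_knn_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of k-NN using only the columns in `features`."""
    correct = 0
    for i in range(len(X)):
        xi = [X[i][f] for f in features]
        # Distances from sample i to every other sample.
        dists = sorted(
            (math.dist(xi, [X[j][f] for f in features]), j)
            for j in range(len(X)) if j != i
        )
        votes = Counter(y[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)

def sequential_selection(X, y, k=1):
    """Greedily add the feature that most improves the score; stop when none does."""
    n_features = len(X[0])
    selected, best = [], 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scores = {f: loo_knn_accuracy(X, y, selected + [f], k) for f in candidates}
        f_best = max(scores, key=scores.get)
        if scores[f_best] <= best:
            break
        selected.append(f_best)
        best = scores[f_best]
    return selected, best

def exhaustive_selection(X, y, k=1):
    """Score every non-empty feature subset and return the best one."""
    n = len(X[0])
    subsets = (list(c) for r in range(1, n + 1) for c in combinations(range(n), r))
    return max(((s, loo_knn_accuracy(X, y, s, k)) for s in subsets),
               key=lambda item: item[1])

# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y = ["axon", "axon", "axon", "dendrite", "dendrite", "dendrite"]
print(sequential_selection(X, y))
print(exhaustive_selection(X, y))
```

The exhaustive search evaluates 2^n - 1 subsets for n features, which is why the sequential variant is the cheaper choice when many features are retained.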
