Model Definition

Appendix A

Model definition

The age (A) and sex (S) of individuals were known, so we considered the four third molars on which experts observe the state of development according to the K stages of the Demirjian scale.

The development state of the hth third molar of an individual is represented by a multivariate vector with K components, one per stage. When the kth component of this vector is set equal to 1 and the others to 0, the experts were certain to observe only the kth dental development state (hard evidence). If the experts believe that more than one state is possible (soft evidence), they can provide their beliefs as non-negative values that sum to 1. Finally, all the evidence (hard or soft) for the hth third molar obtained from the training data set is gathered in an n × K matrix, where n is the size of the training sample.
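The distinction between hard and soft evidence can be made concrete with a minimal sketch. The function names, the 0-based stage index and the choice of K = 8 are assumptions made for illustration only; they are not taken from the authors' implementation.

```python
K = 8  # assumed number of stages, for illustration only


def hard_evidence(stage, n_stages=K):
    """One-hot vector: the expert is certain of a single stage (0-based index)."""
    if not 0 <= stage < n_stages:
        raise ValueError("stage index out of range")
    return [1.0 if k == stage else 0.0 for k in range(n_stages)]


def soft_evidence(beliefs, tol=1e-9):
    """Belief vector over the stages: non-negative entries that sum to 1."""
    if any(b < 0 for b in beliefs):
        raise ValueError("beliefs must be non-negative")
    if abs(sum(beliefs) - 1.0) > tol:
        raise ValueError("beliefs must sum to 1")
    return [float(b) for b in beliefs]


# Evidence for one molar across a toy training sample of n = 3 individuals:
# each row is one individual's evidence vector, giving an n x K matrix.
evidence_matrix = [
    hard_evidence(5),
    soft_evidence([0, 0, 0, 0, 0.7, 0.3, 0, 0]),
    hard_evidence(7),
]
```

Hard evidence is simply the special case of soft evidence in which all the belief is placed on one stage.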

This representation is assumed to be distributed as a multinomial random variable, where the model parameters are in turn assumed to follow a Dirichlet distribution:

(A1)

where the Dirichlet distribution is indexed by a vector of hyperparameters.

Parametric learning

The posterior distribution is obtained from the training data set and results in a mixture of Dirichlet distributions, as detailed in Corradi et al. (2012):

(A2)

where the number of mixture components is the number of polynomials with different bases characterizing the likelihood function, the weights are normalising constants, and each component is indexed by the vector of the exponents of the mth polynomial.
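As an illustration of how a Dirichlet mixture can arise from soft evidence, the sketch below expands the likelihood into monomials and turns each monomial into one mixture component. It assumes that a soft-evidence vector contributes the belief-weighted sum of the stage probabilities to the likelihood; this assumption, the function names and the data layout are illustrative and are not taken from Corradi et al. (2012).

```python
from collections import defaultdict
from math import exp, lgamma, log


def log_beta(a):
    """Log of the multivariate Beta function B(a)."""
    return sum(lgamma(x) for x in a) - lgamma(sum(a))


def dirichlet_mixture_posterior(alpha, evidence):
    """Return [(weight, dirichlet_parameters), ...] for the posterior mixture.

    alpha:    prior Dirichlet hyperparameters, one per stage.
    evidence: list of belief vectors (hard evidence is a one-hot vector).
    """
    n_stages = len(alpha)
    # Likelihood as a polynomial: exponent tuple -> coefficient.
    poly = {(0,) * n_stages: 1.0}
    for beliefs in evidence:
        expanded = defaultdict(float)
        for exponents, coef in poly.items():
            for k, b in enumerate(beliefs):
                if b > 0:
                    bumped = exponents[:k] + (exponents[k] + 1,) + exponents[k + 1:]
                    expanded[bumped] += coef * b
        poly = dict(expanded)
    # Each monomial with exponents c yields a Dirichlet(alpha + c) component
    # with unnormalised weight coef * B(alpha + c); B(alpha) cancels out.
    log_w = {
        c: log(coef) + log_beta([a + ci for a, ci in zip(alpha, c)])
        for c, coef in poly.items()
    }
    shift = max(log_w.values())
    total = sum(exp(v - shift) for v in log_w.values())
    return [
        (exp(v - shift) / total, [a + ci for a, ci in zip(alpha, c)])
        for c, v in log_w.items()
    ]
```

With hard evidence only, the expansion contains a single monomial and the mixture collapses to the usual conjugate Dirichlet update. The number of components can grow quickly with the number of soft observations, so this direct expansion is only practical for small examples.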

Conditional independence

For the age classification problem, the states of the class variable are the three intervals: , and . The size of the ZOI depends on the thresholds and , and the dichotomous analysis is a specific case when . Let and be the joint distributions for the observations and the parameters; then, conditionally on the class variable and the sex , each tooth grows independently of the others, i.e. for . The factorization:

(A3)

allows each to be estimated separately for each combination of class of age and sex.

Classification

Consider an individual to be classified using his/her dental development, sex, and the information derived from the training sample. The age class is treated as a random variable, and the set of hyperparameters for each tooth is assumed available. The classification probability that this individual belongs to the qth age class, given his/her dental evidence and sex, is:

(A4)

where the prior probability depends on the population to which the subject belongs: here we estimated this distribution, conditionally on the sex of the subject, from the available sample proportions. If the dental development is observed without uncertainty (hard evidence), then by (A2), marginalizing with respect to the model parameters, we obtain:

(A5)

i.e. a mixture of Multinomial-Dirichlet distributions evaluated for the specific sex.
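The hard-evidence case can be sketched as follows, under stated assumptions: each tooth's posterior for a given age class and sex is already available as a list of (weight, Dirichlet parameters) pairs, the teeth are combined by a simple product as the conditional independence assumption suggests, and the class labels and numbers are purely hypothetical.

```python
def predictive_hard(mixture, stage):
    """Probability of observing `stage` under a Dirichlet mixture posterior.

    For one Dirichlet(a) component, the predictive probability of stage k
    is a[k] / sum(a); the mixture averages these with its weights.
    """
    return sum(w * a[stage] / sum(a) for w, a in mixture)


def classify(priors, mixtures, stages):
    """Posterior probability of each age class, for one sex.

    priors:   {age_class: prior probability given the sex}
    mixtures: {age_class: [mixture for tooth 1, mixture for tooth 2, ...]}
    stages:   observed stage index for each tooth (hard evidence)
    """
    joint = {}
    for age_class, prior in priors.items():
        likelihood = 1.0
        for mixture, stage in zip(mixtures[age_class], stages):
            likelihood *= predictive_hard(mixture, stage)  # teeth multiply
        joint[age_class] = prior * likelihood
    total = sum(joint.values())
    return {age_class: value / total for age_class, value in joint.items()}


# Toy example with one tooth, two stages and two hypothetical classes.
posterior = classify(
    priors={"minor": 0.5, "adult": 0.5},
    mixtures={
        "minor": [[(1.0, [3.0, 1.0])]],
        "adult": [[(1.0, [1.0, 3.0])]],
    },
    stages=[1],
)
```

In this toy case the later stage is three times as likely under the "adult" posterior as under the "minor" one, so with equal priors the "adult" class receives three quarters of the posterior probability.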

In the case of soft evidence, (A4) and (A5) can be generalized as:

(A6)

where:

(A7)

When hard evidence occurs, expression (A7) coincides with (A5) and, consequently, the classification probabilities (A6) and (A4) are also equal.
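One way to see why the soft-evidence expression reduces to the hard-evidence one is the sketch below. It assumes that soft evidence enters as a belief-weighted average of the per-stage predictive probabilities; this is an illustrative assumption and may differ from the exact form of (A7).

```python
def predictive_stage(mixture, stage):
    """Predictive probability of a single stage under a Dirichlet mixture."""
    return sum(w * a[stage] / sum(a) for w, a in mixture)


def predictive_soft(mixture, beliefs):
    """Belief-weighted predictive probability for a soft-evidence vector."""
    return sum(
        b * predictive_stage(mixture, k) for k, b in enumerate(beliefs) if b > 0
    )


mixture = [(1.0, [3.0, 1.0])]  # toy posterior for one tooth, two stages

soft_value = predictive_soft(mixture, [0.6, 0.4])
hard_value = predictive_soft(mixture, [1.0, 0.0])  # one-hot beliefs
```

With a one-hot belief vector only one term of the sum survives, so `hard_value` equals `predictive_stage(mixture, 0)`, which mirrors the coincidence of the soft and hard expressions.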

The classification probability (A6) is then employed to assign the individual to the age class with the highest probability. A more general classification rule could specify a probabilistic classification threshold as the smallest probability required to assign an individual to an age class.

Classification rule

An individual is assigned to a given age class if the probability of belonging to that class is greater than the probability threshold determined by the decision rule adopted. Formally:

(A8)

If an individual is assigned to the class, this means that the individual is, in effect, not classified.
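A threshold-based rule of this kind can be sketched as below. The function name, the use of `None` to mark an unclassified individual, the class labels and the threshold values are all hypothetical and chosen only for illustration; they are not the thresholds adopted in the paper.

```python
def assign_class(posterior, threshold):
    """Return the most probable age class if its probability exceeds the
    threshold; otherwise return None, meaning the individual is left
    unclassified."""
    best = max(posterior, key=posterior.get)
    return best if posterior[best] > threshold else None


posterior = {"minor": 0.4, "adult": 0.6}  # toy classification probabilities

lenient = assign_class(posterior, threshold=0.5)  # illustrative lower threshold
strict = assign_class(posterior, threshold=0.9)   # illustrative higher threshold
```

Raising the threshold trades efficiency for caution: the same individual is assigned to "adult" under the lower threshold and left unclassified under the higher one.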

In civil cases we assume suffices. In criminal cases higher are required since misclassifications can cause more severe consequences: here we consider . If the ZOI is introduced and defined by some and , we indicate with the class of age assigned to the individual.

Appendix B

In this appendix we give a more rigorous definition of the efficiency and misclassification indexes (1), (2) and (3) introduced in the Efficiency and misclassification section. The notation makes use of the symbols introduced in Appendix A.

Efficiency index:

(B1)

Misclassification index for civil cases:

(B2)

Misclassification index for criminal cases: