Deliverable D.7.2, WP7

IST-2000-29388

TESTING THE FIRST VERSION

OF THE MULTILINGUAL BALKANET DATABASE

BalkaNet

May 2004

Identification Number: IST-2000-29388

Type: Report-Document

Title: Testing the first version of the multilingual Balkanet database

Workpackage: WP7

Task: T.7.2

Deliverable: D.7.2

Date: July 2004

Status: Final

Version: 1

Deliverable Responsible: MEMODATA

EC Project Officer: Erwin Valentini

Project Coordinator: Prof. Dimitrios Christodoulakis

Director, DBLAB

Computer Engineering and Informatics Department

Patras University

GR – 265 00 PATRA

Phone: +30 610 960385

Fax: +30 610 960438

e-mail:

Actual Distribution: Project Consortium, Project Officer, EC

CONTENTS

PREFACE

PART I Raw statistics on the Balkanet Wordnets :

Reference of the XML files used for this evaluation :

Number of synsets by POS and by languages

Coverage of the BCS

Number of ILRs by type and by language

Specific synsets for each language

Part II Evaluation of the equivalence relations and the overlap across the different WordNets

Quality of the equivalence relations by evaluating the performance of the WSD tool

Conclusion

PREFACE

In this task, we evaluate the multilingual WordNet built (as described in Task 7.1) from the individual WordNets.

These wordnets have been developed by the different partners and are available in Bulgarian, Czech, Greek, Romanian, Serbian and Turkish. In this subtask, we perform different tests to check the compatibility and the matching across the different WordNets.

PART I Raw statistics on the Balkanet Wordnets :

Reference of the XML files used for this evaluation :

First, we list the XML files of each WordNet used for this evaluation.

Language / Size (in KB) / Date
Bulgarian / 9788 / 06/13/2004
Czech / 6129 / 06/08/2004
Greek / 7469 / 05/07/2004
Romanian / 5290 / 06/07/2004
Turkish / 3205 / 06/07/2004
Serbian / 2330 / 03/18/2004

Number of synsets by POS and by languages

Once the data were loaded into the multilingual model, we computed some statistics on the Balkanet data, which can be compared with those of EuroWordNet.

We have also included the number of glosses added for each language.

These statistics are displayed in the table below.

Wordnet / POS / Synsets / % of synsets / No. of senses / Senses per synset / Entries / Senses per entry
EuroWordNet
Dutch Wordnet / Nouns / 34455 / 78% / 54428 / 1,58 / 45972 / 1,18
Verbs / 9040 / 21% / 14151 / 1,57 / 8826 / 1,6
Other / 520 / 1% / 1622 / 3,12 / 1485 / 1,09
Total / 44015 / 100% / 70201 / 1,59 / 56283 / 1,25
Spanish Wordnet / Nouns / 18577 / 79% / 41292 / 2,22 / 23216 / 1,78
Verbs / 2602 / 11% / 6795 / 2,61 / 2278 / 2,98
Other / 2191 / 9% / 2439 / 1,11 / 2439 / 1
Total / 23370 / 100% / 50526 / 2,16 / 27933 / 1,81
Italian Wordnet / Nouns / 30169 / 75% / 34552 / 1,15 / 24903 / 1,39
Verbs / 8796 / 22% / 12473 / 1,42 / 6607 / 1,89
Other / 1463 / 4% / 1474 / 1,01 / 1468 / 1
Total / 40428 / 100% / 48499 / 1,2 / 32978 / 1,47
French Wordnet / Nouns / 17826 / 78% / 24499 / 1.37 / 14879 / 1.65
Verbs / 4919 / 22% / 8310 / 1.69 / 3898 / 2.13
Other / 0 / 0% / 0 / 0 / 0 / 0
Total / 22745 / 100% / 32809 / 1.44 / 18777 / 1.75
German Wordnet / Nouns / 9951 / 66% / 13656 / 1.37 / 12746 / 1.07
Verbs / 5166 / 34% / 6778 / 1.31 / 4333 / 1.56
Other / 15 / 0% / 19 / 1.27 / 19 / 1
Total / 15132 / 100% / 20453 / 1.35 / 17098 / 1.20
Czech Wordnet / Nouns / 9727 / 76% / 13829 / 1.42 / 9277 / 1.49
Verbs / 3097 / 24% / 6120 / 1.98 / 3006 / 2.04
Other / 0 / 0% / 0 / 0 / 0 / 0
Total / 12824 / 100% / 19949 / 1.56 / 12283 / 1.62
Estonian Wordnet / Nouns / 5028 / 65% / 8226 / 1.64 / 7209 / 1.14
Verbs / 2650 / 35% / 5613 / 2.12 / 3752 / 1.50
Other / 0 / 0% / 0 / 0 / 0 / 0
Total / 7678 / 100% / 13839 / 1.80 / 10961 / 1.26
English WordNet Addition / Nouns / 4751 / 29% / 14188 / 2,99 / 2524 / 5,62
Verbs / 11363 / 69% / 25761 / 2,27 / 14726 / 1,75
Other / 247 / 2% / 639 / 2,59 / 70 / 9,13
Total / 16361 / 100% / 40588 / 2,48 / 17320 / 2,34
WordNet1.5 / Nouns / 60521 / 64% / 107428 / 1,78 / 88175 / 1,22
Verbs / 11363 / 12% / 25768 / 2,27 / 14734 / 1,75
Other / 22631 / 24% / 54406 / 2,4 / 23708 / 2,29
Total / 94515 / 100% / 187602 / 1,98 / 126617 / 1,48
Balkanet
WordNet2.0 / Nouns / 79689 / 69% / 141691 / 1,78 / 115775 / 1,22
Verbs / 13508 / 12% / 24632 / 1,82 / 11306 / 2,18
Adjectives / 18563 / 16% / 31016 / 1,67 / 21495 / 1,44
Other / 3664 / 3% / 5808 / 1,59 / 4660 / 1,25
Total / 115424 / 100% / 203147 / 1,76 / 153236 / 1,33
Gloss / 115431 / 100% / 115427 / 1,00 / 114419 / 1,01
Greek WordNet / Nouns / 14478 / 78% / 18563 / 1,28 / 14654 / 1,27
Verbs / 3542 / 19% / 5334 / 1,51 / 2917 / 1,83
Adjectives / 639 / 3% / 889 / 1,39 / 675 / 1,32
Other / 24 / 0% / 40 / 1,67 / 31 / 1,29
Total / 18675 / 100% / 24826 / 1,33 / 18442 / 1,35
Gloss / 17713 / 95% / 18702 / 1,06 / 17614 / 1,06
Czech WordNet / Nouns / 20711 / 74% / 30959 / 1,49 / 23603 / 1,31
Verbs / 4973 / 18% / 8755 / 1,76 / 5985 / 1,46
Adjectives / 2128 / 8% / 3015 / 1,42 / 1903 / 1,58
Other / 164 / 1% / 255 / 1,55 / 213 / 1,20
Total / 27976 / 100% / 42984 / 1,54 / 31704 / 1,36
Gloss / 865 / 3% / 865 / 1,00 / 845 / 1,02
Romanian WordNet / Nouns / 10410 / 65% / 18110 / 1,74 / 12191 / 1,49
Verbs / 4032 / 25% / 8587 / 2,13 / 3994 / 2,15
Adjectives / 832 / 5% / 1227 / 1,47 / 987 / 1,24
Other / 742 / 5% / 1205 / 1,62 / 765 / 1,58
Total / 16016 / 100% / 29129 / 1,82 / 17937 / 1,62
Gloss / 16394 / 102% / 16394 / 1,00 / 16098 / 1,02
Serbian WordNet / Nouns / 4853 / 74% / 7872 / 1,62 / 6701 / 1,17
Verbs / 1494 / 23% / 3549 / 2,38 / 2199 / 1,61
Adjectives / 232 / 4% / 299 / 1,29 / 268 / 1,12
Other / 7 / 0% / 10 / 1,43 / 10 / 1,00
Total / 6586 / 100% / 11730 / 1,78 / 9178 / 1,28
Gloss / 6408 / 97% / 6408 / 1,00 / 6314 / 1,01
Turkish WordNet / Nouns / 8616 / 77% / 12794 / 1,48 / 10290 / 1,24
Verbs / 2231 / 20% / 3549 / 1,59 / 2113 / 1,68
Adjectives / 365 / 3% / 608 / 1,67 / 472 / 1,29
Other / 0 / 0% / 0 / - / 0 / -
Total / 11212 / 100% / 16951 / 1,51 / 12875 / 1,32
Gloss / 4692 / 42% / 4696 / 1,00 / 4511 / 1,04
Bulgarian WordNet / Nouns / 13018 / 65% / 22968 / 1,76 / 18860 / 1,22
Verbs / 3958 / 20% / 14425 / 3,64 / 8610 / 1,68
Adjectives / 3059 / 15% / 5043 / 1,65 / 3844 / 1,31
Other / 5 / 0% / 7 / 1,40 / 7 / 1,00
Total / 20040 / 100% / 42443 / 2,12 / 31321 / 1,36
Gloss / 20040 / 100% / 20040 / 1,00 / 20015 / 1,00

Some remarks on these data :

-The number of synsets in the individual WordNets ranges from 6586 (Serbian) to 27976 (Czech), with an average of 16751 synsets. This figure is to be compared with the 115424 synsets of WordNet 2.0.

-Nouns and verbs are over-represented, while adjectives and adverbs (other) are under-represented, compared with WordNet 2.0.

-In each WordNet, almost every synset has a gloss, except in the Czech (3%) and Turkish (42%) WordNets. Nevertheless, we have observed that a few glosses are actually in English:

  • Bulgarian : 2 untranslated glosses
  • Czech : 1 untranslated gloss
  • Greek : 6 untranslated glosses
  • Romanian : 2 untranslated glosses
  • Turkish : 0 untranslated glosses
  • Serbian : 65 untranslated glosses
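The range and average quoted in the remarks above can be re-derived from the "Total" rows of the table. The short Python sketch below does this; the dictionary name is our own, and the figures are simply copied from the table.

```python
# Synset totals per Balkanet wordnet, copied from the "Total" rows above.
synset_totals = {
    "Greek": 18675,
    "Czech": 27976,
    "Romanian": 16016,
    "Serbian": 6586,
    "Turkish": 11212,
    "Bulgarian": 20040,
}

smallest = min(synset_totals, key=synset_totals.get)
largest = max(synset_totals, key=synset_totals.get)
average = sum(synset_totals.values()) / len(synset_totals)

print(f"smallest: {smallest} ({synset_totals[smallest]})")
print(f"largest: {largest} ({synset_totals[largest]})")
print(f"average: {average:.0f}")
```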

Coverage of the BCS

This table (split in two) shows the number of synsets filled by each partner for each BCS. These synsets were to be filled first, since the base concepts represent the most common concepts, BCS1 being more common than BCS2, and BCS2 more common than BCS3.

BCS / English synsets / Bulgarian synsets / % / Czech synsets / % / Greek synsets / %
BCS1 / 1219 / 1219 / 100% / 1219 / 100% / 1219 / 100%
BCS2 / 3471 / 3471 / 100% / 3471 / 100% / 3462 / 99,7%
BCS3 / 3827 / 3827 / 100% / 3823 / 99,9% / 3783 / 99%
Total / 8517 / 8517 / 100% / 8514 / 99,9% / 8464 / 99%
BCS / Romanian synsets / % / Turkish synsets / % / Serbian synsets / %
BCS1 / 1192 / 98% / 1206 / 99% / 1219 / 100%
BCS2 / 3366 / 97% / 3221 / 93% / 3120 / 90%
BCS3 / 3589 / 94% / 3187 / 83% / 1154 / 30%
Total / 8147 / 96% / 7614 / 89% / 5493 / 64%

We can see that the BCS1 synsets are almost completely filled by each partner, since the minimum figure is 98%. The BCS2 coverage is good, with a minimum of 90%. The BCS3 coverage is good too, except for Serbian, with 30% (and 64% over the three sets together).

So, even though the synsets of the individual WordNets represent only about 12% of WordNet 2.0, the base concept synsets are almost complete.
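The percentages in the second half of the table can be recomputed from the raw counts. The following Python sketch does so for the three partners with incomplete coverage; the function and variable names are illustrative, and the counts are copied from the table.

```python
# Synset counts per base concept set (BCS), copied from the tables above.
english = {"BCS1": 1219, "BCS2": 3471, "BCS3": 3827}
filled = {
    "Romanian": {"BCS1": 1192, "BCS2": 3366, "BCS3": 3589},
    "Turkish": {"BCS1": 1206, "BCS2": 3221, "BCS3": 3187},
    "Serbian": {"BCS1": 1219, "BCS2": 3120, "BCS3": 1154},
}


def coverage(counts):
    """Return the percentage of English synsets filled, per BCS and overall."""
    result = {bcs: 100.0 * counts[bcs] / english[bcs] for bcs in english}
    result["Total"] = 100.0 * sum(counts.values()) / sum(english.values())
    return result


for language, counts in filled.items():
    row = ", ".join(f"{name}: {value:.0f}%" for name, value in coverage(counts).items())
    print(f"{language}: {row}")
```

For Serbian this gives 30% for BCS3 and 64% overall, which are the two figures that should not be confused when reading the table.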

Number of ILRs by type and by language

In this section, we will study the type of relations between synsets (ILRs).

We have defined two ways of counting these ILRs for each WordNet.

-The first one is to count the ILRs as we find them in each individual WordNet. These ILRs are named "real" in the table below.

-The second one also seemed interesting to us: for each individual WordNet, we count the ILRs defined by WordNet 2.0 or by any other individual WordNet between two synsets, provided that both synsets contain literals for this language.

Let's take the Romanian WordNet: a hypernym ILR will be counted if both synsets contain at least one Romanian literal, even if this ILR is not explicitly defined in the Romanian WordNet.

For example, the synset "blue" has a hypernym link to the synset "color" only in the Greek WordNet. But since both synsets have a Romanian literal, this link has been counted as "computed" for the Romanian WordNet.

This method may be used to benefit from the additions made to one individual WordNet in order to extend the others automatically. Of course, for some relation types, such as "eng_derivative", extending them by this method may be meaningless.
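The two counting methods can be illustrated with a minimal Python sketch. The data structures (a list of relation tuples and a mapping from synsets to the languages that have a literal in them) are assumptions made for this illustration and do not reflect the actual database model. The sketch also assumes that a link defined in several wordnets is counted only once as "computed".

```python
from collections import Counter


def count_ilrs(relations, literals, language):
    """Count "real" and "computed" ILRs for one language.

    relations: iterable of (source_id, relation_type, target_id, defined_in),
               where defined_in names the wordnet that encodes the link
               (hypothetical structure).
    literals:  dict mapping synset_id -> set of languages having a literal.
    """
    real = Counter()
    computed_links = set()
    for source, rel_type, target, defined_in in relations:
        # "real": the link is encoded in this language's own wordnet.
        if defined_in == language:
            real[rel_type] += 1
        # "computed": both synsets have a literal in the language,
        # whichever wordnet defines the link (assumed: counted once).
        if language in literals.get(source, set()) and language in literals.get(target, set()):
            computed_links.add((source, rel_type, target))
    computed = Counter(rel_type for _, rel_type, _ in computed_links)
    return real, computed


# The "blue"/"color" example from the text, as toy data.
relations = [("blue", "hypernym", "color", "Greek")]
literals = {"blue": {"Greek", "Romanian"}, "color": {"Greek", "Romanian"}}
real, computed = count_ilrs(relations, literals, "Romanian")
print(real["hypernym"], computed["hypernym"])  # 0 1
```

The link is not "real" for Romanian, because it is defined only in the Greek WordNet, but it is "computed" for Romanian because both synsets carry a Romanian literal.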

The results of both counting methods are given in the following table:

-The results of the first method are in the column "real"

-The results of the second method are in the column "computed"

For comparison, we have added the ILR statistics for WordNet 2.0.

Relation / WordNet 2.0 / % / Bulgarian computed / % / Bulgarian real / % / Czech computed / % / Czech real / %
also_see / 3241 / 1,6% / 2110 / 3,9% / 1220 / 4,3% / 1734 / 2,6% / 763 / 2,3%
be_in_state / 1297 / 0,6% / 778 / 1,5% / 604 / 2,1% / 755 / 1,1% / 602 / 1,8%
category_domain / 6166 / 3,0% / 1438 / 2,7% / 1305 / 4,6% / 1271 / 1,9% / 1113 / 3,3%
causes / 219 / 0,1% / 124 / 0,2% / 95 / 0,3% / 132 / 0,2% / 117 / 0,3%
derived / 6566 / 3,2% / 1197 / 2,2% / 1155 / 4,1% / 190 / 0,3% / 0,0%
eng_derivative / 36646 / 17,9% / 15660 / 29,3% / 0,0% / 20443 / 31,1% / 0,0%
holo_member / 12213 / 6,0% / 1275 / 2,4% / 849 / 3,0% / 1521 / 2,3% / 1087 / 3,2%
holo_part / 8642 / 4,2% / 1831 / 3,4% / 1288 / 4,6% / 2556 / 3,9% / 1775 / 5,3%
holo_portion / 788 / 0,4% / 175 / 0,3% / 109 / 0,4% / 459 / 0,7% / 357 / 1,1%
hypernym / 95085 / 46,3% / 18082 / 33,8% / 16916 / 59,9% / 27188 / 41,4% / 23873 / 70,8%
near_antonym / 7642 / 3,7% / 2454 / 4,6% / 1930 / 6,8% / 2413 / 3,7% / 1772 / 5,3%
particle / 106 / 0,1% / 56 / 0,1% / 56 / 0,2% / 1 / 0,0% / 0,0%
region_domain / 1280 / 0,6% / 60 / 0,1% / 28 / 0,1% / 179 / 0,3% / 0,0%
similar_to / 22196 / 10,8% / 7018 / 13,1% / 1564 / 5,5% / 5392 / 8,2% / 1138 / 3,4%
subevent / 409 / 0,2% / 189 / 0,4% / 178 / 0,6% / 233 / 0,4% / 218 / 0,6%
usage_domain / 983 / 0,5% / 60 / 0,1% / 29 / 0,1% / 152 / 0,2% / 0,0%
verb_group / 1750 / 0,9% / 1007 / 1,9% / 922 / 3,3% / 1076 / 1,6% / 916 / 2,7%
Total / 205229 / 100,0% / 53514 / 100% / 28248 / 100,0% / 65695 / 100% / 33731 / 100,0%
Relation / WordNet 2.0 / % / Greek computed / % / Greek real / % / Romanian computed / % / Romanian real / %
also_see / 3241 / 1,6% / 1077 / 2,4% / 210 / 0,9% / 1191 / 2,8% / 379 / 1,8%
be_in_state / 1297 / 0,6% / 600 / 1,3% / 143 / 0,6% / 625 / 1,5% / 533 / 2,5%
category_domain / 6166 / 3,0% / 833 / 1,8% / 0,0% / 715 / 1,7% / 562 / 2,6%
causes / 219 / 0,1% / 102 / 0,2% / 76 / 0,3% / 124 / 0,3% / 120 / 0,6%
derived / 6566 / 3,2% / 81 / 0,2% / 64 / 0,3% / 558 / 1,3% / 0,0%
eng_derivative / 36646 / 17,9% / 12936 / 28,5% / 0,0% / 15720 / 36,6% / 0,0%
holo_member / 12213 / 6,0% / 2199 / 4,9% / 1323 / 5,4% / 979 / 2,3% / 756 / 3,5%
holo_part / 8642 / 4,2% / 3165 / 7,0% / 2708 / 11,1% / 1388 / 3,2% / 980 / 4,6%
holo_portion / 788 / 0,4% / 260 / 0,6% / 162 / 0,7% / 167 / 0,4% / 107 / 0,5%
hypernym / 95085 / 46,3% / 19054 / 42,0% / 18476 / 75,6% / 15572 / 36,3% / 14419 / 67,5%
near_antonym / 7642 / 3,7% / 1576 / 3,5% / 691 / 2,8% / 1817 / 4,2% / 1516 / 7,1%
particle / 106 / 0,1% / 0,0% / 0,0% / 0,0% / 0,0%
region_domain / 1280 / 0,6% / 86 / 0,2% / 0,0% / 50 / 0,1% / 0,0%
similar_to / 22196 / 10,8% / 2197 / 4,8% / 46 / 0,2% / 2687 / 6,3% / 896 / 4,2%
subevent / 409 / 0,2% / 153 / 0,3% / 131 / 0,5% / 171 / 0,4% / 166 / 0,8%
usage_domain / 983 / 0,5% / 57 / 0,1% / 0,0% / 82 / 0,2% / 0,0%
verb_group / 1750 / 0,9% / 945 / 2,1% / 418 / 1,7% / 1052 / 2,5% / 929 / 4,3%
Total / 205229 / 100,0% / 45321 / 100% / 24448 / 100,0% / 42898 / 100% / 21363 / 100,0%
Relation / WordNet 2.0 / % / Turkish computed / % / Turkish real / % / Serbian computed / % / Serbian real / %
also_see / 3241 / 1,6% / 992 / 3,2% / 237 / 1,4% / 600 / 3,1% / 104 / 1,0%
be_in_state / 1297 / 0,6% / 551 / 1,8% / 544 / 3,2% / 289 / 1,5% / 134 / 1,3%
category_domain / 6166 / 3,0% / 544 / 1,8% / 533 / 3,2% / 293 / 1,5% / 179 / 1,7%
causes / 219 / 0,1% / 97 / 0,3% / 96 / 0,6% / 66 / 0,3% / 45 / 0,4%
derived / 6566 / 3,2% / 5 / 0,0% / 0,0% / 103 / 0,5% / 99 / 0,9%
eng_derivative / 36646 / 17,9% / 10004 / 32,5% / 0,0% / 7200 / 37,6% / 1964 / 18,7%
holo_member / 12213 / 6,0% / 1059 / 3,4% / 958 / 5,7% / 946 / 4,9% / 717 / 6,8%
holo_part / 8642 / 4,2% / 1681 / 5,5% / 1478 / 8,8% / 617 / 3,2% / 367 / 3,5%
holo_portion / 788 / 0,4% / 220 / 0,7% / 218 / 1,3% / 97 / 0,5% / 37 / 0,4%
hypernym / 95085 / 46,3% / 11398 / 37,0% / 10604 / 63,0% / 6949 / 36,3% / 6186 / 59,0%
near_antonym / 7642 / 3,7% / 1320 / 4,3% / 1295 / 7,7% / 717 / 3,7% / 430 / 4,1%
particle / 106 / 0,1% / 0,0% / 0,0% / 9 / 0,0% / 9 / 0,1%
region_domain / 1280 / 0,6% / 57 / 0,2% / 0,0% / 3 / 0,0% / 0,0%
similar_to / 22196 / 10,8% / 1966 / 6,4% / 29 / 0,2% / 740 / 3,9% / 12 / 0,1%
subevent / 409 / 0,2% / 116 / 0,4% / 116 / 0,7% / 83 / 0,4% / 60 / 0,6%
usage_domain / 983 / 0,5% / 32 / 0,1% / 3 / 0,0% / 20 / 0,1% / 2 / 0,0%
verb_group / 1750 / 0,9% / 744 / 2,4% / 714 / 4,2% / 402 / 2,1% / 139 / 1,3%
Total / 205229 / 100,0% / 30786 / 100% / 16825 / 100,0% / 19134 / 100% / 10484 / 100%

We can see that the distribution is fairly homogeneous across the WordNets. However, hypernym ILRs are over-represented in each individual WordNet: they account for 59% to 76% of the "real" ILRs, against 46% in WordNet 2.0. It seems that this kind of relation has been favoured over the other relation types.

Specific synsets for each language

Most of the synsets in the different monolingual WordNets are actually synsets taken from the Princeton WordNet 2.0.

Nevertheless, it is interesting to know how many language-specific synsets each partner has created.

The following table gives the number of specific synsets by language.

Language / Number of specific synsets
Bulgarian / 237
Czech / 0
Greek / 820
Romanian / 0
Turkish / 297
Serbian / 7

We can see that the number of specific synsets is low, and even zero for Czech and Romanian. The exceptions are Greek, with 820 specific synsets, and to a lesser extent Bulgarian and Turkish, with 237 and 297 specific synsets respectively.

PART II Evaluation of the equivalence relations and the overlap across the different WordNets

In this section, we look at how the Princeton WordNet synsets are filled and at the overlap between the different WordNets.

For example, the table below shows the distribution of the synsets according to the number of different languages they contain, i.e., the languages for which there is at least one literal. "0 languages" means that there is no literal from any of the BalkaNet partners in these synsets; they contain only English literals. "6 languages" means that all six BalkaNet partners have added at least one literal to these synsets.

Number of languages / Number of synsets
1 language / 74146
2 languages / 22172
3 languages / 7432
4 languages / 3185
5 languages / 1721
6 languages / 3076
7 languages / 5050
total / 116782

This gives the following graphic :

We can now study the relations between each language and the others. This table gives, for each language, the number of synsets in which each of the other languages is also present.

Language / Bulgarian / Czech / Greek / Romanian / Turkish / Serbian
Bulgarian / - / 14511 / 10500 / 11858 / 8604 / 5802
Czech / 14511 / - / 11715 / 12899 / 9006 / 5883
Greek / 10500 / 11715 / - / 9772 / 8561 / 5731
Romanian / 11858 / 12899 / 9772 / - / 8090 / 5566
Turkish / 8604 / 9006 / 8561 / 8090 / - / 5328
Serbian / 5802 / 5883 / 5731 / 5566 / 5328 / -
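Both overlap tables of this section can be derived from a single mapping from each synset to the set of languages that have at least one literal in it. The Python sketch below shows one possible way to do this. The mapping and the function name are assumptions made for the illustration, and the data are toy values, not taken from the Balkanet database.

```python
from collections import Counter
from itertools import combinations


def overlap_statistics(synset_languages):
    """Compute both overlap tables from a synset -> languages mapping.

    synset_languages: dict mapping a synset id to the set of languages
    having at least one literal in that synset (hypothetical structure).
    """
    # Distribution: how many synsets contain literals from n languages.
    by_language_count = Counter(len(langs) for langs in synset_languages.values())
    # Pairwise overlap: how many synsets contain both languages of a pair.
    pairwise = Counter()
    for langs in synset_languages.values():
        for pair in combinations(sorted(langs), 2):
            pairwise[pair] += 1
    return by_language_count, pairwise


# Toy data for illustration only.
toy = {
    "s1": {"Bulgarian", "Czech", "Greek"},
    "s2": {"Bulgarian", "Czech"},
    "s3": {"Greek"},
    "s4": set(),
}
by_count, pairwise = overlap_statistics(toy)
print(dict(by_count))
print(pairwise[("Bulgarian", "Czech")])
```

Because each pair is stored once, with the language names in alphabetical order, the resulting matrix is symmetric by construction, as in the table above.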

The following graphics show, for each language, the number of synsets it has in common with each of the other languages.

Quality of the equivalence relations by evaluating the performance of the WSD tool

In order to both evaluate the performance of the WSDtool and assess the accuracy of the interlingual linking of the Romanian WordNet to PWN2.0, we selected a bag of English target nouns, verbs, and adjectives extracted from the parallel corpus of George Orwell's 1984 so that all their senses (at least two per POS) defined in PWN2.0 were also included and interlingually aligned in the Romanian wordnet. This set contained 211 words which had 1810 occurrences in 1385 sentences of the English part of the parallel corpus.

To create a "gold standard" sense tagging for evaluation purposes, we manually sense-tagged all the occurrences of the 211 target words. We then enlisted 13 students enrolled in the Computational Linguistics Masters program at the University "A.I. Cuza" of Iasi to manually assign senses to the same occurrences of the target words. An extraction script generated for each student a subset of the 1385 sentences containing occurrences of the targeted words.

The extraction process ensured that the same sentence was in at least three student-sets. With each of the 1810 occurrences of the target words disambiguated by at least three students, we computed a simple majority sense (MAJ) for each occurrence of the target words.
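The majority sense can be sketched in a few lines of Python. How ties between students were resolved is not stated here, so the sketch below assumes that a tie yields no majority sense (`None`). The function name is illustrative.

```python
from collections import Counter


def majority_sense(assigned_senses):
    """Return the sense chosen by most annotators for one occurrence.

    assigned_senses: list of sense labels, one per student.
    Returns None for an empty list or a tie (assumed tie handling).
    """
    counts = Counter(assigned_senses).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no single most frequent sense
    return counts[0][0]


print(majority_sense([2, 2, 1]))  # 2
print(majority_sense([1, 2, 3]))  # None
```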

Disambiguation results for the same set of words were then generated by the WSDtool algorithm. The system was unable to make a decision for 398 of the 1810 occurrences, primarily in cases where the occurrence was not translated in the Romanian text or was incorrectly determined by the word-aligner. An evaluation program was then applied that generated a file containing detailed information for each of the 1810 occurrences, including

  • the sense number for that occurrence in the gold standard (GS)
  • the majority sense assigned by the student annotators (MAJ)
  • the sense assigned by the algorithm (ALG)
  • the names of the students who evaluated the occurrence and the sense(s) they assigned

For comparison purposes, we took into account only the 1412 occurrences that were sense disambiguated by the algorithm. Table 3 summarizes the results.

It is interesting to note that the agreement between the algorithm and the gold standard is higher than between the majority vote of the students and the gold standard.

GS=MAJ / 73.22%
GS=ALG / 78.68%
MAJ=ALG / 67.13%
GS=MAJ=ALG / 62.32%

Table 3. WSD agreements (without back-off mechanism)
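The four agreement figures of Table 3 are simple proportions over the occurrences that the algorithm disambiguated. The Python sketch below shows the computation on toy data. The tuple layout and the use of `None` for an undecided occurrence are assumptions made for the illustration, not the format of the actual evaluation file.

```python
def agreement_rates(occurrences):
    """Return agreement percentages over the occurrences the algorithm tagged.

    occurrences: list of (gs, maj, alg) sense labels; alg is None when
    the algorithm made no decision (assumed encoding).
    """
    tagged = [o for o in occurrences if o[2] is not None]
    if not tagged:
        raise ValueError("no occurrence was disambiguated by the algorithm")
    n = len(tagged)

    def pct(condition):
        return 100.0 * sum(1 for o in tagged if condition(*o)) / n

    return {
        "GS=MAJ": pct(lambda gs, maj, alg: gs == maj),
        "GS=ALG": pct(lambda gs, maj, alg: gs == alg),
        "MAJ=ALG": pct(lambda gs, maj, alg: maj == alg),
        "GS=MAJ=ALG": pct(lambda gs, maj, alg: gs == maj == alg),
    }


# Toy data: four occurrences, one of them left undecided by the algorithm.
toy = [(1, 1, 1), (2, 2, 3), (1, 2, 1), (3, 3, None)]
rates = agreement_rates(toy)
for name, value in rates.items():
    print(f"{name}: {value:.2f}%")
```

In the evaluation reported above, the same filtering step reduces the 1810 occurrences to the 1412 that the algorithm disambiguated.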

Conclusion

This report summarizes the main results of a detailed quantitative analysis of the coverage of, and the intersection across, the monolingual Balkan Wordnets currently under development. The results indicate that all Wordnets display a significant degree of general vocabulary coverage for the languages involved, well beyond the expectations set at the early stages of the work. The only exception is the Serbian Wordnet, whose development started at a later stage; it nevertheless manages to reach the envisaged numbers and is trying to keep up with the evolution of the other Wordnets.

In particular, concerning the terminological coverage of the Wordnets, we have shown that a set of common terms has been encoded across all Wordnets, ensuring a satisfactory degree of overlap. Moreover, some language-specific concepts have also been incorporated to some extent and are currently being enriched in order to lexicalize concepts not shared across languages. Besides terminology, there is also significant coverage of the types of lexical relations encoded within the Wordnets, which makes them rich knowledge sources. Most importantly, all Wordnets have been fully linked to their English equivalent, so that navigation between their contents is efficiently supported.

In the future, it is planned to enrich the monolingual Wordnets with new terminology, mostly language-specific, and with domain knowledge information. Moreover, the qualitative evaluation of each monolingual Wordnet is an ongoing task which aims to deliver a good-quality Balkan semantic network by the end of the project.
