Computational Linguistics Academic Lecture

Lexicons, ontologies and related issues:

towards a new generation of Language Resources

Speaker: Nicoletta Calzolari

Istituto di Linguistica Computazionale del CNR,

Pisa, Italy

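Time: Monday, September 3, 2007, 3:45-5:15 p.m.

Venue: Room 1-315, FIT Building, Tsinghua University (the building on the left after entering the East Gate of Tsinghua; Room 1-315 is on the 3rd floor)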

[Abstract]

I'll touch on issues related to: language resources (LR) and semantics, dynamic resources automatically acquired, interoperability among LRs and Language Technology (LT), and how to move towards a new generation of LRs in the Semantic Web (SW) framework, pointing to the potential of, and the need for, cross-fertilisation between the two communities of Human Language Technology (HLT) and SW.

Large-scale LRs are unanimously recognised as the necessary infrastructure underlying LT. Discussing a few major European initiatives for building harmonised lexicons, I will highlight how computational lexicons and textual corpora should be considered as complementary views on the lexical space, with a view to modelling a new type of resource which is both a lexicon and a corpus together. A ‘complete’ computational lexicon should incorporate and represent our ‘knowledge of the world’. We claim that it is theoretically impossible to achieve completeness within any ‘static’ lexicon. Moreover, choices on the syntagmatic axis are pervasive in language. A sound language infrastructure must encompass both ‘static’ lexicons, such as the traditional ones, and ‘dynamic’ systems able to enrich the lexicon with information acquired on-line from large corpora, thus capturing the ‘actually realised’ potentialities, the large range of variation, and the flexibility inherent in the language as it is used. These are the challenges for semantic tagging, at the core of the SW vision.

Broadening our perspective into the future, the need for ever more ‘knowledge intensive’ large-size LRs for effective content processing requires a change of paradigm, and the design of a new generation of LRs, based on open content interoperability standards. The SW notion is going to crucially determine the shape of the LRs of the future, consistent with the vision of an open distributed space of sharable knowledge available on the Web for processing.

I’ll also point to two fairly new infrastructural initiatives, launched at the European Commission, in the area of LRs and LTs, which may influence how we shape the future of our field. These initiatives call for international cooperation also outside Europe, and may be relevant for setting up a global Forum for LRs and LTs.

[Speaker Biography]

Nicoletta Calzolari has worked in the field of Computational Linguistics since 1972, first as Researcher at the Department of Linguistics of Pisa University, then as Director of Research at ILC, Istituto di Linguistica Computazionale of CNR, Pisa.

Since August 2003 she has been the Director of the Istituto di Linguistica Computazionale of the Italian Research Council (ILC-CNR), Pisa, Italy.

She has co-ordinated many international, European and national projects and strategic initiatives, mostly in the fields of Language Resources and Standardisation. She has more than 350 publications.

In addition to other editorial activities, she is Director of the Journal Linguistica Computazionale, IEPI, Pisa – Roma, and Co-editor with Nancy Ide of the new International Journal Language Resources and Evaluation, Springer. She has served as conference chair of LREC2004, LREC2006, COLING/ACL2006, and Italian TAL2006.

Her main fields of interest are: Human Language Technology; computational lexicology and lexicography; language resources; corpus linguistics; standardisation and evaluation of language resources; lexical semantics and semantic annotation; collocations and multiwords; derivational morphology; knowledge acquisition from multiple (lexical and textual) sources, integration and representation; validation of language resources.

Prof. Calzolari is, among other roles, a member and general secretary of ICCL, Vice-president of the ELRA Board, member of the ACL Executive Committee, President of the PAROLE Association, founding member of the Italian Forum for HLT at the Ministry of Communications, convenor of WG4 of ISO TC37 SC4, member of the Advisory Committee for the 21st Century COE (Center of Excellence) Program of Tokyo Institute of Technology, member of the IULA (Barcelona) Advisory Committee, member of the SENSEVAL Advisory Committee, and member of many other International Committees and Advisory Boards.

She has been an invited speaker, programme committee member, or organiser for numerous international scientific conferences, workshops, etc.

All are welcome!

Contact: Sun Maosong (孙茂松), Liu Zhiyuan (刘知远)

Phone: 13810325978