ASMA: A System for Automatic Segmentation and Morpho-Syntactic
Disambiguation of Modern Standard Arabic
Muhammad Abdul-Mageed, Mona Diab and Sandra Kübler ........... 1
Optimising Tree Edit Distance with Subtrees for Textual
Entailment
Maytham Alabbas and Allan Ramsay ............................. 9
Opinion Learning from Medical Forums
Tanveer Ali, Marina Sokolova, David Schramm and Diana
Inkpen ...................................................... 18
Annotating Events, Time and Place Expressions in Arabic Texts
Hassina Aliane, Wassila Guendouzi and Amina Mokrani ......... 25
A Semi-supervised Learning Approach to Arabic Named Entity
Recognition
Maha Althobaiti, Udo Kruschwitz and Massimo Poesio .......... 32
An NLP-based Reading Tool for Aiding Non-native English Readers
Mahmoud Azab, Ahmed Salama, Kemal Oflazer, Hideki Shima,
Jun Araki and Teruko Mitamura ............................... 41
Improving Sentiment Analysis in Twitter Using Multilingual
Machine Translated Data
Alexandra Balahur and Marco Turchi .......................... 49
Domain Adaptation for Parsing
Eric Baucom, Levi King and Sandra Kübler .................... 56
Towards a Structured Representation of Generic Concepts and
Relations in Large Text Corpora
Archana Bhattarai and Vasile Rus ............................ 65
Authorship Attribution in Health Forums
Victoria Bobicev, Marina Sokolova, Khaled El Emam and Stan
Matwin ...................................................... 74
TwitIE: An Open-Source Information Extraction Pipeline for
Microblog Text
Kalina Bontcheva, Leon Derczynski, Adam Funk, Mark
Greenwood, Diana Maynard and Niraj Aswani ................... 83
A Unified Lexical Processing Framework based on the Margin
Infused Relaxed Algorithm. A Case Study on the Romanian
Language
Tiberiu Boroş ............................................... 91
Automatic Extraction of Contextual Valence Shifters.
Noémi Boubel, Thomas François and Hubert Naets .............. 98
Grammar-Based Lexicon Extension for Aligning German Radiology
Text and Images
Claudia Bretschneider, Sonja Zillner and Matthias Hammon ... 105
Recognising and Interpreting Named Temporal Expressions
Matteo Brucato, Leon Derczynski, Hector Llorens, Kalina
Bontcheva and Christian S. Jensen .......................... 113
Unsupervised Improving of Sentiment Analysis Using Global
Target Context
Tomáš S. Brychcín and Ivan Habernal ........................ 122
An Agglomerative Hierarchical Clustering Algorithm for
Labelling Morphs
Burcu Can and Suresh Manandhar ............................. 129
Temporal Text Classification for Romanian Novels set in the
Past
Alina Maria Ciobanu, Liviu P. Dinu, Octavia-Maria Şulea,
Anca Dinu and Vlad Niculae ................................. 136
A Dictionary-Based Approach for Evaluating Orthographic
Methods in Cognates Identification
Alina Maria Ciobanu and Liviu Petrisor Dinu ................ 141
A Pilot Study on the Semantic Classification of Two German
Prepositions: Combining Monolingual and Multilingual Evidence
Simon Clematide and Manfred Klenner ........................ 148
Semantic Relations between Events and their Time, Locations
and Participants for Event Coreference Resolution
Agata Cybulska and Piek Vossen ............................. 156
Sense Clustering Using Wikipedia
Bharath Dandala, Chris Hokamp, Rada Mihalcea and Razvan
Bunescu .................................................... 164
Effective Spell Checking Methods Using Clustering Algorithms
Renato Cordeiro de Amorim and Marcos Zampieri .............. 172
Normalization of Dutch User-Generated Content
Orphée De Clercq, Sarah Schulz, Bart Desmet,
Els Lefever and Véronique Hoste ............................ 179
Linguistic Profiling of Texts Across Textual Genres and
Readability Levels. An Exploratory Study on Italian Fictional
Prose
Felice Dell'Orletta, Simonetta Montemagni and Giulia
Venturi .................................................... 189
Twitter Part-of-Speech Tagging for All: Overcoming Sparse
and Noisy Data
Leon Derczynski, Alan Ritter, Sam Clark and Kalina
Bontcheva .................................................. 198
Weighted Maximum Likelihood Loss as a Convenient Shortcut to
Optimizing the F-measure of Maximum Entropy Classifiers
Georgi Dimitroff, Laura Toloşi, Borislav Popov and Georgi
Georgiev ................................................... 207
Sequence Tagging for Verb Conjugation in Romanian
Liviu Dinu, Octavia-Maria Şulea and Vlad Niculae ........... 215
A Tagging Approach to Identify Complex Constituents for Text
Simplification
Iustin Dornescu, Richard Evans and Constantin Orasan ....... 221
Automatic Evaluation Metric for Machine Translation that is
Independent of Sentence Length
Hiroshi Echizen'ya, Kenji Araki and Eduard Hovy ............ 230
Acronym Recognition and Processing in 22 languages
Maud Ehrmann, Leonida della Rocca, Ralf Steinberger and
Hristo Tanev ............................................... 237
An Evaluation Summary Method Based on a Combination of
Content and Linguistic Metrics
Samira Ellouze, Maher Jaoua and Lamia Hadrich Belguith ..... 245
Hierarchy Identification for Automatically Generating
Table-of-Contents
Nicolai Erbs, Iryna Gurevych and Torsten Zesch ............. 252
Temporal Relation Classification in Persian and English
contexts
Mahbaneh Eshaghzadeh Torbati, Gholamreza Ghassem-sani,
Seyed Abolghasem Mirroshandel, Yadollah Yaghoobzadeh and
Negin Karimi Hosseini ...................................... 261
The Extended Lexicon: Language Processing as Lexical
Description
Roger Evans ................................................ 270
Did I Really Mean That? Applying Automatic Summarisation
Techniques to Formative Feedback
Debora Field, Stephen Pulman, Nicolas Van Labeke, Denise
Whitelock and John Richardson .............................. 277
Matching Sets of Parse Trees for Answering Multi-sentence
Questions
Boris Galitsky, Dmitry Ilvovsky, Sergei O. Kuznetsov and
Fedor Strok ................................................ 285
Realization of Common Statistical Methods in Computational
Linguistics with Functional Automata
Stefan Gerdjikov, Petar Mitankin and Vladislav Nenchev ..... 294
Mining Fine-grained Opinion Expressions with Shallow Parsing
Sucheta Ghosh, Sara Tonelli and Richard Johansson .......... 302
Justifying Corpus-Based Choices in Referring Expression
Generation
Helmut Horacek ............................................. 311
A Boosting-based Algorithm for Classification of Semi-
Structured Text using the Frequency of Substructures
Tomoya Iwakura ............................................. 319
Headerless, Quoteless, but not Hopeless? Using Pairwise
Email Classification to Disentangle Email Threads
Emily Jamison and Iryna Gurevych ........................... 327
Using Parallel Corpora for Word Sense Disambiguation
Dimitar Kazakov and Ahmad R. Shahid ........................ 336
Semantic Relation Recognition within Polish Noun Phrase:
A Rule-based Approach
Pawel Kedzia and Marek Maziarz ............................. 342
Unsupervised Induction of Arabic Root and Pattern Lexicons
using Machine Learning
Bilal Khaliq and John Carroll .............................. 350
Towards Domain Adaptation for Parsing Web Data
Mohammad Khan, Markus Dickinson and Sandra Kübler .......... 357
Capturing Anomalies in the Choice of Content Words in
Compositional Distributional Semantic Space
Ekaterina Kochmar and Ted Briscoe .......................... 365
Incremental and Predictive Dependency Parsing under Real-
Time Conditions
Arne Köhn and Wolfgang Menzel .............................. 373
Rationale, Concepts, and Current Outcome of the Unit Graphs
Framework
Maxime Lefrançois and Fabien Gandon ........................ 382
The Unit Graphs Framework: Foundational Concepts and
Semantic Consequence
Maxime Lefrançois and Fabien Gandon ........................ 389
Confidence Estimation for Knowledge Base Population
Xiang Li and Ralph Grishman ................................ 396
Towards Fine-grained Citation Function Classification
Xiang Li, Yifan He, Adam Meyers and Ralph Grishman ......... 402
Supervised Morphology Generation Using Parallel Corpus
Alireza Mahmoudi, Mohsen Arabsorkhi and Heshaam Faili ...... 408
Sentiment Analysis of Reviews: Should we Analyze Writer
Intentions or Reader Perceptions?
Isa Maks and Piek Vossen ................................... 415
Revisiting the Old Kitchen Sink: Do we Need Sentiment Domain
Adaptation?
Riham Mansour, Nesma Refaei, Michael Gamon, Ahmed Abdul-
Hamid and Khaled Sami ...................................... 420
Evaluation of Baseline Information Retrieval for Polish
Open-domain Question Answering System
Michał Marcińczuk, Adam Radziszewski, Maciej Piasecki,
Dominik Piasecki and Marcin Ptak ........................... 428
WCCL Relation - a Toolset for Rule-based Recognition of
Semantic Relations Between Named Entities
Michał Marcińczuk .......................................... 436
Beyond the Transfer-and-Merge Wordnet Construction:
plWordNet and a Comparison with WordNet
Marek Maziarz, Maciej Piasecki, Ewa Rudnicka and Stan
Szpakowicz ................................................. 443
History Based Unsupervised Data Oriented Parsing
Mohsen Mesgar and Gholamreza Ghasem-Sani ................... 453
Contrasting and Corroborating Citations in Journal Articles
Adam Meyers ................................................ 460
CCG Categories for Distributional Semantic Models
Paramita Mirza and Raffaella Bernardi ...................... 467
Discourse-aware Statistical Machine Translation as
a Context-sensitive Spell Checker
Behzad Mirzababaei, Heshaam Faili and Nava Ehsan ........... 475
Cross-Lingual Information Retrieval and Semantic
Interoperability for Cultural Heritage Repositories
Johanna Monti, Mario Monteleone, Maria Pia di Buono and
Federica Marano ............................................ 483
Improving Web 2.0 Opinion Mining Systems Using Text
Normalisation Techniques
Alejandro Mosquera and Paloma Moreda Pozo .................. 491
Identifying Social and Expressive Factors in Request Texts
Using Transaction/Sequence Model
Daša Munková, Michal Munk and Zuzana Fráterová ............. 496
Parameter Optimization for Statistical Machine Translation:
It Pays to Learn from Hard Examples
Preslav Nakov, Fahad Al Obaidli, Francisco Guzman and
Stephan Vogel .............................................. 504
Automatic Cloze-Questions Generation
Annamaneni Narendra, Manish Agarwal and Rakshit Shah ....... 511
High-Accuracy Phrase Translation Acquisition Through Battle-
Royale Selection
Lionel Nicolas, Egon W. Stemle, Klara Kranebitter and
Verena Lyding .............................................. 516
Enriching Patent Search with External Keywords:
a Feasibility Study
Ivelina Nikolova, Irina Temnikova and Galia Angelova ....... 525
A Clustering Approach for Translationese Identification
Sergiu Nisioi and Liviu P. Dinu ............................ 532
PurePos 2.0: a Hybrid Tool for Morphological Disambiguation
György Orosz and Attila Novák .............................. 539
More than Bag-of-Words: Sentence-based Document
Representation for Sentiment Analysis
Georgios Paltoglou and Mike Thelwall ....................... 546
Information Spreading in Expanding Wordnet Hypernymy
Structure
Maciej Piasecki, Radoslaw Ramocki and Michal Kaliński ...... 553
Context Independent Term Mapper for European Languages
Mārcis Pinnis .............................................. 562
Semi-supervised vs. Cross-domain Graphs for Sentiment
Analysis
Natalia Ponomareva and Mike Thelwall ....................... 571
Towards a Hybrid Rule-based and Statistical Arabic-French
Machine Translation System
Fatiha Sadat ............................................... 579
Segmenting vs. Chunking Rules: Unsupervised ITG Induction
via Minimum Conditional Description Length
Markus Saers, Karteek Addanki and Dekai Wu ................. 584
A Combined Pattern-based and Distributional Approach for
Automatic Hypernym Detection in Dutch.
Gwendolyn Schropp, Els Lefever and Véronique Hoste ......... 593
Exploiting Synergies Between Open Resources for German
Dependency Parsing, POS-tagging, and Morphological Analysis
Rico Sennrich, Martin Volk and Gerold Schneider ............ 601
Using a Weighted Semantic Network for Lexical Semantic
Relatedness
Reda Siblini and Leila Kosseim ............................. 610
A New Approach to the POS Tagging Problem Using Evolutionary
Computation
Ana Paula Silva, Arlindo Silva and Irene Rodrigues ......... 619
How Joe and Jane Tweet about Their Health: Mining for
Personal Health Information on Twitter
Marina Sokolova, Stan Matwin, Yasser Jafer and David
Schramm .................................................... 626
What Sentiments Can Be Found in Medical Forums?
Marina Sokolova and Victoria Bobicev ....................... 633
Automated Learning of Everyday Patients Language for Medical
Blogs Analytics
Giovanni Stilo, Moreno De Vincenzi, Alberto E. Tozzi and
Paola Velardi .............................................. 640
How Symbolic Learning Can Help Statistical Learning (and
vice versa)
Isabelle Tellier and Yoann Dupont .......................... 649
Measuring Closure Properties of Patent Sublanguages
Irina Temnikova, Negacy Hailu, Galia Angelova and
K. Bretonnel Cohen ......................................... 659
Closure Properties of Bulgarian Clinical Text
Irina Temnikova, Ivelina Nikolova, William
A. Baumgartner, Galia Angelova and K. Bretonnel
Cohen ......................................................... 667
Analyzing the Use of Character-Level Translation with Sparse
and Noisy Datasets
Jörg Tiedemann and Preslav Nakov ........................... 676
A Feature Induction Algorithm with Application to Named
Entity Disambiguation
Laura Toloşi, Valentin Zhikov, Georgi Georgiev and
Borislav Popov ............................................. 685
Introducing a Corpus of Human-Authored Dialogue Summaries in
Portuguese
Norton Trevisan Roman, Paul Piwek, Ariadne M.B. Rizzoni
Carvalho and Alexandre Rossi Alvares ....................... 692
Wikipedia as an SMT Training Corpus
Dan Tufiş, Radu Ion, Ştefan Dumitrescu and Dan Ştefănescu .. 702
DutchSemCor: in Quest of the Ideal Sense-tagged Corpus
Piek Vossen, Rubén Izquierdo and Attila Görög .............. 710
Towards Detecting Anomalies in the Content of Standardized
LMF Dictionaries
Wafa Wali, Bilel Gargouri and Abdelmajid Ben Hamadou ....... 719
Edit Distance: A New Data Selection Criterion for Domain
Adaptation in SMT
Longyue Wang, Derek F. Wong, Lidia S. Chao, Junwen Xing,
Yi Lu and Isabel Trancoso .................................. 727
Automatic Enhancement of LTAG Treebank
Farzaneh Zarei, Ali Basirat, Heshaam Faili and Maryam
S. Mirian .................................................. 733
Inductive and Deductive Inferences in a Crowdsourced
Lexical-Semantic Network
Manel Zarrouk, Mathieu Lafourcade and Alain Joubert ........ 740
Machine Learning for Mention Head Detection in Multilingual
Coreference Resolution
Desislava Zhekova and Sandra Kübler ........................ 747
Combining POS Tagging, Dependency Parsing and Coreferential
Resolution for Bulgarian
Valentin Zhikov, Georgi Georgiev, Kiril Simov and Petya
Osenova .................................................... 755
magyarlanc: A Tool for Morphological and Dependency Parsing
of Hungarian
János Zsibrita, Veronika Vincze and Richárd Farkas ......... 763