Similar Languages, Varieties, and Dialects


A Computational Perspective
Marcos Zampieri, Rochester Institute of Technology, New York
Preslav Nakov, HBKU, Doha, Qatar
August 2021
This ISBN is for an eBook version which is distributed on our behalf by a third party.
Adobe eBook Reader
9781316998946
$88.99
USD

    Language resources and computational models are becoming increasingly important for the study of language variation. A main challenge of this interdisciplinary field is that linguistics researchers may not be familiar with these computational tools, while many NLP researchers are not familiar with language variation phenomena. This essential reference introduces researchers to the computational models needed for processing similar languages, varieties, and dialects. In this book, leading experts tackle the inherent challenges of the field by balancing a thorough discussion of the theoretical background with a meaningful overview of state-of-the-art language technology. The book can be used in a graduate course, or as a supplementary text for courses on language variation, dialectology, and sociolinguistics or on computational linguistics and NLP. Part 1 covers the linguistic fundamentals of the field, such as the question of status and language variation. Part 2 discusses data collection and pre-processing methods. Finally, Part 3 presents NLP applications such as speech processing and machine translation, along with language-specific issues in Arabic and Chinese.

    • Features chapters written by well-known researchers in Dialectology, Language Variation, Sociolinguistics, Computational Linguistics, and Natural Language Processing

    Reviews & endorsements

    ‘Variation is a key aspect of human language, and yet it has been too often overlooked in computational linguistics. The book edited by Marcos Zampieri and Preslav Nakov is an important step towards filling this gap with top-level contributions that offer a new alliance between natural language processing and linguistic theory to understand this complex phenomenon and its impact on applications.’ Alessandro Lenci, University of Pisa


    Product details

    August 2021
    Adobe eBook Reader
    9781316998946
    0 pages
    1 colour illus.

    Table of Contents

    • Introduction Marcos Zampieri and Preslav Nakov
    • Part I:
    • 1. Language variation James Walker
    • 2. Phonetic variation in dialects Rachael Tatman
    • 3. Similar languages, varieties and dialects Miriam Meyerhoff and Steffen Klaere
    • 4. Mutual intelligibility Charlotte Gooskens and Vincent J. van Heuven
    • 5. Dialectology for computational linguists John Nerbonne, Wilbert Heeringa, Jelena Prokić and Martijn Wieling
    • Part II:
    • 6. Data collection and representation for similar languages, varieties and dialects Tanja Samardžić and Nikola Ljubešić
    • 7. Adaptation of morphosyntactic taggers Yves Scherrer
    • 8. Sharing dependency parsers between similar languages Željko Agić
    • Part III:
    • 9. Dialect and similar language identification Marcos Zampieri
    • 10. Dialect variation on social media Dong Nguyen
    • 11. Machine translation between similar languages Preslav Nakov and Jorg Tiedemann
    • 12. Automatic spoken dialect identification Pedro Torres-Carrasquillo and Bengt Borgström
    • 13. Arabic dialect processing Nizar Habash
    • 14. Automatic classification of varieties of Mandarin Chinese Hongzhi Xu, Menghan Jiang, Jingxia Lin, Dingxu Shi and Chu-Ren Huang.
      Contributors
    • Marcos Zampieri, Preslav Nakov, James Walker, Rachael Tatman, Miriam Meyerhoff, Steffen Klaere, Charlotte Gooskens, Vincent J. van Heuven, John Nerbonne, Wilbert Heeringa, Jelena Prokić, Martijn Wieling, Tanja Samardžić, Nikola Ljubešić, Yves Scherrer, Željko Agić, Dong Nguyen, Jorg Tiedemann, Pedro Torres-Carrasquillo, Bengt Borgström, Nizar Habash, Hongzhi Xu, Menghan Jiang, Jingxia Lin, Dingxu Shi, Chu-Ren Huang

    • Editors
    • Marcos Zampieri, University of Cologne, Germany

      Dr. Marcos Zampieri is an assistant professor at the Rochester Institute of Technology, where he teaches courses in linguistics and natural language processing. He received his PhD from Saarland University in Germany with a thesis on computational models applied to pluricentric languages. Dr. Zampieri is one of the organizers of the well-established VarDial workshop series on NLP for Similar Languages, Varieties, and Dialects. His research deals with the application of computational models to large collections of texts. He has worked on a variety of topics including language acquisition and variation, (machine) translation and post-editing, and social media mining.

    • Preslav Nakov, HBKU, Doha, Qatar

      Dr. Preslav Nakov is Principal Scientist at the Qatar Computing Research Institute at Hamad Bin Khalifa University. He leads the Tanbih mega-project, developed in collaboration with MIT. He co-authored a book on Semantic Relations between Nominals, two books on computer algorithms, and many research papers in top-tier conferences and journals. He received the Young Researcher Award at RANLP'2011. He was also the first to receive the Bulgarian President's John Atanasoff award, named after the inventor of the first automatic electronic digital computer. Dr. Nakov's research has been featured in over 100 news outlets, including Forbes, Boston Globe, and MIT Technology Review.