Computational Linguistics
This is the main page for this subject. Most course materials, as well as announcements, will be made available here, so visit this page regularly.
The course has two main parts: a theoretical part and an applied part. Click on the headings to get the materials for any given topic. Materials include:
Study and reference texts
External links to relevant sites and resources
Software
Other lab materials
Text materials in HTML have been designed to be printer-friendly. No gratuitous graphics or other fancy elements have been used. Remember that images are not part of HTML documents, so if you save pages with images, you'll also have to save these images (by right-clicking on the image and choosing "Save image as...", or whatever your browser says).
As our lab has Windows NT machines, we will be working with Win32 software. This means that, apart from topic-specific tools, you will need:
A decent text editor (no, MS Notepad won't do!). Get TextPad now!
Ghostview to view and print PostScript documents. Download Ghostview from here (Win32 version).
Acrobat Reader for viewing PDF files. Download from here.
A compressor/decompressor for Windows. Download and install Winzip.
--------------------------------------------------------------------------------
ASSIGNMENT
--------------------------------------------------------------------------------
Theory
1. Introduction
What is computational linguistics?
2. Language resources I: Text Corpora
Early Corpus Linguistics and the Chomskyan Revolution
What is a Corpus, and what is in it?
Quantitative Data
The Use of Corpora in Language Studies
Corpus Typology
3. Natural Language Analysis and Understanding
Get the whole section in PDF format from here (zipped).
Representation
Grammars and constituent structure
Grammatical relations
Meaning
Processing
Parsing
Generation
More on representation: Meaning
Semantics
Pragmatics
Real World Knowledge
4. An Introduction to Machine Translation
Get the whole section in PDF format from here.
Why MT matters
Popular Conceptions and Misconceptions
A bit of history
--------------------------------------------------------------------------------
To Know More...
...about CL and NLP
Computational Linguistics: the journal
The Association for Computational Linguistics (ACL)
La Asociación Española para el Procesamiento del Lenguaje Natural (SEPLN)
...about Corpus Linguistics
The Bank of English (COBUILD)
The Birmingham Corpus Page
The British National Corpus
Michael Barlow's Home Page (lots of links)
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Practice
1. Computers?
For the totally clueless: That mouse won't bite you
Hardware Architectures and Operating Systems
Must-have DOS commands
An Introduction to MS Windows
What is the Internet?: You don't really need a board to surf!
2. Text Hacking
Text editing and text processing. Formats
The initiation: basic text editing techniques
Search and Replace: Regular Expressions
UNIX tools for text hacking
GNU grep
GNU sed
Other text-related Unix tools
Corpus techniques and tools
Techniques
Tools
3. Publishing Content
Publishing for the web: HTML basics
Academic publishing: LaTeX basics
4. Analysing Natural Language
English Morphology using PC-KIMMO
English Syntax using PC-PATR
5. Databases
Basic DB concepts.
Designing a lexical database
Using SQL
6. Programming in Prolog
What is Prolog?
Know the language
Doing syntax in Prolog (DCG)
Processing word meaning
--------------------------------------------------------------------------------
To Know More...
...about computers
More about the Internet (history, technical details)
A Gazetteer and Dictionary for DOS 6.22
...about text hacking
Project Gutenberg: free text collection
Mike Scott's Home Page (author of WordSmith)
BNC Indexer
Regular Expressions manual from NASA (comprehensive)
Regular expressions in Perl
--------------------------------------------------------------------------------
Copyright © 2001 Antonio Moreno Ortiz.
Dedicated to Corpus Research in China!
Wednesday, May 16, 2007
- seanxpq@gmail.com QQ:553575272
- Research fields: Corpus Linguistics; SLA; Translation Studies
Time & Tide
Academic Links
- American National Corpus (ANC)
- AntConc concordancer
- Audience Dialogue
- Bank of English Polylexicon
- Birmingham Centre for Corpus Research
- BNC Variation in English Words and Phrases (VIEW)
- BNC-- British National Corpus
- British Academic Written English Corpus
- CHILDES - Child Language Data Exchange System
- Chun Yu KIT's Home Page
- CLaRK System for Corpora Development
- CLEC
- Cobuild Concordance and Collocations Sampler
- Concordancing Programs
- Corpus BROWN
- Corpus del español [DAVIES-NEH]
- Corpus4U.org
- CorpusSearch 2: a tool for linguistic research
- Corsis (formerly Tenka Text)
- Dexter--Tools for analyzing language data
- Edict Web Concordancer
- ELISA Concordancer
- ENGCG Constraint Grammar Parser of English
- English (ESL/EFL) Learning Websites
- English Corpus Studies (JP)
- FLOB-Frown Corpus Web Concordancer
- GlossaNet online concordancer
- HK Bilingual Corpus of Legal and Documentary Texts
- HK Corpuslinguist
- HK English-Chinese Parallel Concordancer
- HK Parallel Texts Viewer and Concordancer
- Ke Ping's Site
- keyword finder
- LDC-Online
- Leeds collection of Internet corpora
- LEXA: Corpus Processing Software
- Life,Learning & Entertainment
- Linguistic Studies
- MICASE Concordance Search
- Michigan Corpus of Academic Spoken English (MICASE)
- Michigan Corpus of Upper-level Student Papers (MICUSP)
- MicroConcord (Academic) Search
- MicroConcord (Journalistic) Search
- MicroConcord Corpus Collections
- MicroConcord,WebQuiz,Dropper,Word Splitter
- NTNU Concordancer and Collocation Retrieval System
- Parallel Texts Viewer and Concordancer (HK)
- Paul Nation
- Phrases in English
- PLUG Word Aligner (PWA)
- Provalis Research
- SCOTS Project - Scottish Corpus of Texts and Speech
- Simple Concordance Program
- TAIParse Part-of-Speech (POS) Tagger
- TAPoR Text Analysis Software
- Text Content Analyser - UsingEnglish.com
- TextLadder
- TextQuest
- The Compleat Lexical Tutor
- The Content Analysis Guidebook Online
- The English-Chinese Parallel Concordancer
- Thomas Michael Cobb
- Tim Johns Data-driven Learning Page
- Tony McEnery and Andrew Wilson
- WordCruncher
- WordHoard
- WordPilot 2002
- WordSmith Tools
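- English-Chinese Parallel Texts of Classic Works (名著英汉对照)
- Translation Studies of Dream of the Red Chamber (红楼梦翻译研究)
- Discourse Analysis Research Blog (话语分析研究博客)
- 语料天涯
- Corpus Linguistics and English Language Education and Teaching (语料库语言学与英语教育教学)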
News-Press
Links & Resources
The Internet TESL Journal - Articles, Research Papers, Lessons Plans, Classroom Handouts, Teaching Ideas & Links
Activities for ESL Students - Free online quizzes, exercises and puzzles to help you study English