Syntactic Analysis - project
For the 7.5 hp parsing course you have to do a project, equivalent to 2.5 hp, or approximately 1.7 weeks of full-time work.
The project can be done either individually or in pairs. If you wish to do the project in a pair, it is your responsibility to find someone to work with. The scope of a pair project has to be larger than that of an individual project.
The project has to be related to parsing, but you can choose the topic quite freely. You may work on either phrase structure parsing or dependency parsing, but if you want to focus on dependency parsing, you may need to study some of the material on your own ahead of the course schedule.
Here are some tentative ideas for the project:
- Evaluation project. Evaluate one or more parsers. Here you have to decide which language(s) you are interested in, which parser(s) you want to evaluate, what type of text domain(s), and what type of evaluation you will perform.
- Implement a parser/parser component. One option is to use one or more of the VG tasks in assignment 1 as a starting point, and considerably extend it/them. Note that the project should not overlap with any VG task you have already done. You may also consider implementing Earley's algorithm (a recognizer is typically enough for the size of the project). It is also possible to come up with another plan for implementing a parser or parser component.
- Treebank transformations. Investigate the effect of different types of treebank transformations on parsing. This article can be a good starting point: Mark Johnson. PCFG Models of Linguistic Tree Representations. Computational Linguistics 24(4). Pages 613-632. In addition, J&M 14.9 might be relevant.
- Network engineering for parsing. Investigate the effect of different variations in the neural network used in a modern constituency or dependency parser. Note that this task is technically advanced, and it is not recommended unless you have an understanding of neural networks gained from outside of our program.
- Feature engineering for dependency parsing. Investigate the effect of different types of features and transition systems used for learning the best transitions in an "old school" dependency parser, for example MaltParser.
- Cross-lingual dependency parsing with UUparser and UD treebanks. To get started you can check out an old assignment on the theme: Cross-lingual dependency parsing (Note: this link was updated to support UD 2.9 on Feb 16). There you can see how you can run UUparser on our Linux system, and get some inspiration. It is recommended that you run few-shot experiments rather than zero-shot. A suitable project might be along the lines of a VG task, where you state a hypothesis, design some experiments, and run the evaluation. In addition, we would expect you to have a description in your report of how UUparser works on a basic level.
- Your own proposal
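For the evaluation and dependency parsing ideas above, results are commonly reported as unlabeled and labeled attachment scores (UAS and LAS). The following is only a minimal sketch of how these two scores can be computed, assuming that each token is represented as a (head index, dependency label) pair and that gold and predicted sentences are already aligned token by token. The function name `attachment_scores` and the toy data are illustrative assumptions, not part of any course material or parser API, and details such as punctuation handling are ignored.

```python
from typing import List, Tuple

# Each token is (head_index, dependency_label).
# This representation is an assumption made for illustration only.
Token = Tuple[int, str]


def attachment_scores(
    gold: List[List[Token]], pred: List[List[Token]]
) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions computed over all tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted data must have the same number of sentences")
    total = uas_hits = las_hits = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences must have the same length")
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                uas_hits += 1  # correct head, label not considered
                if g_label == p_label:
                    las_hits += 1  # correct head and correct label
    if total == 0:
        raise ValueError("no tokens to evaluate")
    return uas_hits / total, las_hits / total


# Toy example: one three-token sentence where the last token gets the wrong head.
gold = [[(2, "nsubj"), (0, "root"), (2, "obj")]]
pred = [[(2, "nsubj"), (0, "root"), (1, "obj")]]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")
```

In an actual project you would normally rely on an established evaluation script rather than your own code, but a sketch like this can help make clear what the reported numbers mean.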
Project proposal and groups
Before starting the project you need to decide whether you will work alone or in a pair with a peer. Sign up for a group in Studium, either individually or as a pair. This is needed in order to hand in your proposal and report. Please do not sign up with another student unless you have already decided that you want to work together.
You will first write a project proposal of around half an A4 page, where you describe what you intend to do in your project. The deadline for the project proposal is February 25.
The project should be presented in a final report (PDF) describing what you have done in your project and relating it to the parsing literature. If your project included implementation, you should also hand in your code. If you have more than one code file, please zip them. The length and content of the report will vary depending on the specific project.
If you work in a pair, you will also have to present your work in a short informal meeting with your supervisor. Note that both members of the pair should know and understand the full project, even if you divided the work between you to some extent.
- Project proposal: February 25
- Oral project report (for pairs): March 23 (time slot to be decided)
- Written project report: March 25
If you have any questions about the project proposal, please contact Sara.