Digitala Vetenskapliga Arkivet

Alignment-based profiling of Europarl data in an English-Swedish parallel corpus
Linköping University, Department of Computer and Information Science, NLPLAB - Natural Language Processing Laboratory. Linköping University, The Institute of Technology. (HCS)
2010 (English). In: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10) / [ed] Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias, Paris, France: European Language Resources Association (ELRA), 2010, p. 3398-3404. Conference paper, Published paper (Refereed)
Abstract [en]

This paper profiles the Europarl part of an English-Swedish parallel corpus and compares it with three other subcorpora of the same parallel corpus. We first describe our method for comparison, which is based on alignments at both the token level and the structural level. Although two of the other subcorpora contain fiction, it is found that the Europarl part is the one having the highest proportion of many types of restructurings, including additions, deletions and long-distance reorderings. We explain this by the fact that the majority of Europarl segments are parallel translations.

Place, publisher, year, edition, pages
Paris, France: European Language Resources Association (ELRA), 2010. p. 3398-3404
Keywords [en]
parallel corpora, profiling, translation, English, Swedish
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:liu:diva-60039
ISI: 000356879508030
ISBN: 2-9517408-6-7 (print)
OAI: oai:DiVA.org:liu-60039
DiVA, id: diva2:354794
Conference
7th International Conference on Language Resources and Evaluation (LREC)
Available from: 2010-10-05 Created: 2010-10-04 Last updated: 2018-01-12
Bibliographically approved

Open Access in DiVA

fulltext (418 kB), 466 downloads
File information
File name: FULLTEXT01.pdf
File size: 418 kB
Checksum (SHA-512): 7c9d6708234586911bebce1d9a45ab34cb16998499791977f1ab10957488d3bcf4141054db87e45e00844232ad3e22eed5496fd7e359e726d3fe48c5ef0100b5
Type: fulltext
Mimetype: application/pdf

By author/editor
Ahrenberg, Lars