Text analysis software to help learners write in French

Abstract: New text analysis software, developed thanks to research in areas such as Machine Learning and Natural Language Processing, is also useful in language theory and research. Littératron is a new data-processing tool for automatic syntactic pattern extraction that was designed at the LIP6 laboratory by Jean-Gabriel Ganascia. By syntactic pattern we mean an association of coherent linguistic units. More exactly, the inputs of Littératron are syntactic analysis trees provided by a linear text analyzer, and its outputs are recurrent syntactic patterns. In addition, Littératron is able to compare several texts in order to detect which syntactic patterns are present in one text and absent from another. It is this kind of discrimination which is used to help build the characteristics of learner writing. Littératron performs a cognitive diagnosis of the learner by analysing the linguistic style of the written French of learners of French as a foreign language.
Our experiments were based on certification in French as a foreign language following the guidelines laid down by ALTE (Association of Language Testers in Europe). The narrative framework of the writing tests consists of postcards, friendly letters and personal statements. Because we work with written productions that follow strong textual models, comparing these productions allows us to detect syntactic particularities (linguistic and stylistic mistakes, for example).
The acquisition of French as a foreign language is studied here by comparing the nature and frequency of syntactic patterns extracted from the written production of learners and native speakers. The learners may come from a heterogeneous group (different language levels and different mother tongues) or from a homogeneous group (only one language level and one mother tongue, here Arabic). We found decisive patterns in these two groups of learners.
With a view to computer-assisted language learning, a cognitive diagnosis can be carried out using Littératron to study stylistic figures that are counted using recurrent patterns. The diagnosis is built from the stylistic mistakes taken from the learners' written work. Our software extracts a set of characteristic figures concerning, for instance, noun expansion or the use of adjectives, adverbs, punctuation, etc. It then analyses the mistakes or the over-use of certain expressions using these figures.
Future developments might include transferring the data to the inference module, which is based on a rule database, in order to establish the learner profile. This profile could be used as a diagnostic tool by the remote tutor. From the information provided by the system about the characteristics of the learner's mistakes, the tutor can make choices and point the learner to the learning activities that are best adapted to the learner's needs, as well as show him how to build a representation of his own learning and needs.
This approach can be of interest in three fields: language teaching, on a purely educational basis; computational linguistics; computer-assisted learning (as a tutoring tool).
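The extraction and comparison steps described in the abstract can be illustrated with a minimal Python sketch. It is not Littératron's algorithm, which the abstract does not detail. As a simplifying assumption, a "pattern" here is just a node label followed by the labels of its children, and the tree encoding, the function names (`productions`, `recurrent_patterns`, `discriminating`) and the toy sentences are all hypothetical.

```python
from collections import Counter

# Assumed toy encoding (not Littératron's input format):
# a tree is (label, child, child, ...); a leaf is a plain string (the word).
def productions(tree):
    """Yield one 'Parent -> Child Child' string per node that has subtrees."""
    label, *children = tree
    kids = [c for c in children if not isinstance(c, str)]
    if kids:
        yield f"{label} -> {' '.join(k[0] for k in kids)}"
    for kid in kids:
        yield from productions(kid)

def recurrent_patterns(trees, min_count=2):
    """Count patterns over a corpus and keep those seen at least min_count times."""
    counts = Counter(p for t in trees for p in productions(t))
    return {p: n for p, n in counts.items() if n >= min_count}

def discriminating(patterns_a, patterns_b):
    """Patterns recurrent in corpus A but absent from corpus B."""
    return sorted(set(patterns_a) - set(patterns_b))

# Invented examples: the "learner" trees place the adjective before the noun.
learner = [
    ("S", ("NP", ("Det", "le"), ("Adj", "noir"), ("N", "chat")), ("VP", ("V", "dort"))),
    ("S", ("NP", ("Det", "la"), ("Adj", "bleue"), ("N", "maison")), ("VP", ("V", "brille"))),
]
native = [
    ("S", ("NP", ("Det", "le"), ("N", "chat"), ("Adj", "noir")), ("VP", ("V", "dort"))),
    ("S", ("NP", ("Det", "la"), ("N", "maison"), ("Adj", "bleue")), ("VP", ("V", "brille"))),
]

learner_only = discriminating(recurrent_patterns(learner), recurrent_patterns(native))
print(learner_only)
```

In this toy setting the only pattern that recurs in the learner trees and never appears in the native trees is the adjective-before-noun noun phrase, which is the kind of discrimination the abstract refers to. Real syntactic patterns would span more than one tree level and would need a parser to produce the input trees.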
Document type:
Conference paper
Computer Assisted Language Learning 2006, 2006, Antwerp, Belgium. pp.26-35, 2006

https://edutice.archives-ouvertes.fr/edutice-00087745
Contributor: Isabelle Audras
Submitted on: Wednesday, 26 July 2006 - 15:07:41
Last modified on: Friday, 31 August 2018 - 09:25:55
Document(s) archived on: Friday, 13 May 2011 - 23:38:39

Identifiers

  • HAL Id : edutice-00087745, version 1

Collections

TICE | UPMC | LIP6

Citation

Isabelle Audras, Jean-Gabriel Ganascia. Text analysis software to help learners write in French. Computer Assisted Language Learning 2006, 2006, Antwerp, Belgium. pp.26-35, 2006. 〈edutice-00087745〉

Metrics

Record views: 457

File downloads: 142