I’m a PhD student @ LORIA, University of Lorraine, France, working within the Orpailleur team. My work sits at the intersection of discourse analysis and data mining: it aims to take the discourse or argumentation structure of texts into account in the text mining process. I’m interested in the representation of documents, especially by combining word or sentence embeddings produced by neural models with discourse structure annotated by humans.
Recent Publications
An Alignment Cost-Based Classification of Log Traces Using Machine-Learning
Conformance checking is an important aspect of process mining that identifies the differences between the behaviors recorded in a log and those exhibited by an associated process model. Machine learning and deep learning methods perform extremely well in sequence analysis. We successfully apply both a Recurrent Neural Network and a Random Forest classifier to the problem of evaluating whether the alignment cost of a log trace to a process model is below an arbitrary threshold, and provide a lower bound for the fitness of the process model based on the classification.
Do Sentence Embeddings Capture Discourse Properties of Sentences from Scientific Abstracts?
We introduce four tasks designed to determine which sentence encoders best capture discourse properties of sentences from scientific abstracts, namely coherence between clauses of a sentence, and discourse relations within sentences. We show that even if contextual encoders such as BERT or SciBERT encode the coherence in discourse units, they do not help to predict three discourse relations commonly used in scientific abstracts. We discuss what these results underline, namely that these discourse relations are based on particular phrasing that allows non-contextual encoders to perform well.
Alignement de Structures Argumentatives et Discursives par Fouille de Graphes et de Redescriptions
In this article, we study the similarity between argumentative and discourse structures by aligning subtrees in a corpus annotated with both RST and argumentation structure. Unlike previous work, we are interested not only in a relation-to-relation alignment but also in an alignment of substructures. Using data mining methods, we show that similarities exist between argumentation and discourse. The multiple annotation of the corpus also makes it possible to propose an alignment between the structures.