
PhD proposal (granted)

IMT Mines Alès (Engineering School; Ecole nationale supérieure des mines d’Alès)
Laboratory and research team: KID (Knowledge and Image analysis for Decision making) / LGI2P (Laboratoire de Génie Informatique et d’Ingénierie de Production; Center of Computer Science and Production Engineering)
Location: Alès, South of France (1h from Montpellier, 45min from Nîmes)
Doctoral school: I2S (Information Structures Systèmes)
Discipline: Computer Science (CNU section 27)
Thesis supervisors: Sylvie Ranwez, Professor at IMT Mines Alès; Vincent Ranwez, Professor at Montpellier SupAgro; Nicolas Sutton Charani, Lecturer at IMT Mines Alès
Funding: IMT Mines Alès (school grant)
Starting date: Autumn 2018
Application deadline: June 15, 2018
Contact: +33 (0)4 34 24 62 62


Ontologies are successfully used as semantic guides when navigating the huge and ever-increasing quantity of digital documents. Indeed, they form the backbone of key functionalities such as indexing, retrieving, filtering and analyzing relevant information in a given context.
OntoToolkit aims to provide applications dedicated to ontology processing, implementing the algorithmic solutions that we have published.

MUD – Multiple Uncertainty Detection

MUD detects uncertainty in natural language. It relies on a new supervised, generic approach based on the statistical analysis of multiple lexical and syntactic features. These features characterize sentences through vector-based representations that can be analyzed by proven classification methods (such as SVM).
You may find additional details in the following publications:

  • “Uncertainty detection in natural language: a probabilistic model”. Pierre-Antoine Jean, Sébastien Harispe, Sylvie Ranwez, Patrice Bellot, Jacky Montmain. In Proceedings of the 6th International Conference on Web Intelligence, Mining and Semantics (WIMS’16), Nîmes, France, June 13-15, 2016. Rajendra Akerkar, Michel Plantié, Sylvie Ranwez, Sébastien Harispe, Anne Laurent, Patrice Bellot, Jacky Montmain, and François Trousset (Eds.). ACM International Conference Proceeding Series, New York, NY, USA, Article 10, 10 pages. ISBN: 978-1-4503-4056-4.
  • (in French only) “Un modèle probabiliste pour la détection de l’incertitude dans le langage naturel”. Pierre-Antoine Jean, Sébastien Harispe, Sylvie Ranwez, Patrice Bellot, Jacky Montmain. Actes de CORIA 2016.

Source code available on GitHub.
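
As a rough illustration of the vector-based sentence representation described above, the sketch below maps a sentence to a small feature vector. It is not MUD's actual feature set: the cue lists `HEDGE_CUES` and `MODAL_VERBS`, the function name `sentence_features` and the four features are all assumptions chosen for the example.

```python
import re

# Hypothetical cue lexicons, for illustration only (not MUD's real features).
HEDGE_CUES = {"may", "might", "could", "possibly", "perhaps", "suggests", "likely"}
MODAL_VERBS = {"may", "might", "could", "would", "should"}


def sentence_features(sentence):
    """Map a sentence to a small fixed-length feature vector."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    n_tokens = len(tokens) or 1  # avoid division by zero on empty input
    hedge_count = sum(t in HEDGE_CUES for t in tokens)
    modal_count = sum(t in MODAL_VERBS for t in tokens)
    return [
        hedge_count,                          # lexical: number of hedge cues
        hedge_count / n_tokens,               # lexical: hedge cue density
        modal_count,                          # crude syntactic proxy: modal verbs
        int(sentence.strip().endswith("?")),  # surface form: is it a question?
    ]
```

Vectors of this kind, computed for sentences labeled as certain or uncertain, could then be fed to any standard supervised classifier such as an SVM.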

Book Announcement: Semantic Similarity from Natural Language and Ontology Analysis

We are very pleased to announce to interested parties that our survey:
“Semantic Similarity from Natural Language and Ontology Analysis”
Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, and Jacky Montmain
Synthesis Lectures on Human Language Technologies, May 2015, Vol. 8, No. 1, Pages 1-254
is now available.

Additional information is available on the Morgan and Claypool Publishers website.

Keywords: Semantic similarity, semantic relatedness, semantic measures, distributional measures, domain ontology, knowledge-based semantic measure.

USI – User-oriented Semantic Indexer

User-oriented Semantic Indexer is the name of an efficient algorithm for annotating documents of any type.

The main motivation behind this work is to provide a kNN-based approach for annotating entities, be they textual documents, songs or movies. While other methods often combine machine learning and feature analysis of a given document (e.g. textual features), USI’s approach is completely independent of the document content. The only requirement for an accurate annotation is a neighborhood of relevant documents that are already accurately annotated. Finding a good neighborhood is an independent task, related to information retrieval, for which an extensive list of tools already exists.
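
To make the neighborhood idea concrete, here is a minimal sketch of annotating an entity from its neighbors' annotations alone. It uses a plain voting rule, which is a deliberate simplification and not the USI heuristic itself; the function name `annotate_from_neighbors` and the `min_support` parameter are assumptions made for this example.

```python
from collections import Counter


def annotate_from_neighbors(neighbor_annotations, min_support=0.5):
    """Return the concepts carried by at least `min_support` of the neighbors.

    `neighbor_annotations` is a list of concept sets, one per neighbor.
    The content of the entity being annotated is never inspected.
    """
    if not neighbor_annotations:
        return set()
    # Count, for each concept, how many neighbors carry it.
    counts = Counter(c for ann in neighbor_annotations for c in set(ann))
    threshold = min_support * len(neighbor_annotations)
    return {concept for concept, k in counts.items() if k >= threshold}


# Hypothetical neighborhood of three already annotated songs.
neighbors = [
    {"Jazz", "Piano"},
    {"Jazz", "Saxophone"},
    {"Jazz", "Piano", "Live"},
]
suggested = annotate_from_neighbors(neighbors)
```

A voting rule like this ignores the relations between concepts in the thesaurus, so it treats closely related concepts as unrelated votes.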

Thanks to the rise of thesauri, ontologies and knowledge representations in general, more and more data can be annotated by concepts. Semantic indexing began in the biomedical field, but much more content can now benefit from conceptual indexing thanks to DBpedia or Freebase. USI aims to do so, whatever the content and whatever the thesaurus.

USI is presented as a heuristic algorithm optimizing an objective function. We propose an algorithmic optimization of this heuristic to make it fast enough for practical use, implemented in the USI Java library. This library is also hosted on GitHub and can be freely downloaded and used in your own project.

SML – Semantic Measures Library

The Semantic Measures Library and Toolkit are robust, open source and easy-to-use software solutions dedicated to semantic measures. They can be used for large-scale computation and analysis of semantic similarities, proximities or distances between terms or concepts defined in knowledge representations, e.g., structured vocabularies, taxonomies, RDF graphs. The comparison of instances (e.g., documents, patient records, genes) annotated by concepts is also supported. An important aspect of these new solutions is that they are generic and are therefore not tailored to a specific application context. They can thus be used with various controlled vocabularies and knowledge representation languages (e.g. OBO, RDF, OWL). The project targets both designers and practitioners of semantic measures, providing a Java source code library as well as a command-line toolkit that can be used on personal computers or computer clusters.
The library implements a large collection of state-of-the-art measures, and several parametric measures provide fine-grained tuning capabilities for specific usage contexts. The Semantic Measures Library and Toolkit aim to equip communities studying and using semantic measures with robust, reliable, efficient, open source, generic and easy-to-use tools. Downloads, documentation, updates and community support are available on the project website.

“The Semantic Measures Library and Toolkit: fast computation of semantic similarity and relatedness using biomedical ontologies”.
Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, Jacky Montmain. Oxford Bioinformatics 2013.
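
For readers new to semantic measures, the sketch below computes one classic knowledge-based measure, the Wu-Palmer similarity, on a toy taxonomy. It does not use the Semantic Measures Library API; the `PARENT` taxonomy and all function names are hypothetical, and the sketch assumes a single-parent tree, whereas real ontologies are often graphs with multiple inheritance.

```python
# Toy taxonomy as child -> parent links (hypothetical data, single-parent tree).
PARENT = {
    "animal": None,
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "sparrow": "bird",
}


def ancestors(concept):
    """Return the path from `concept` up to the root, concept included."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth of a concept, counting the root as depth 1."""
    return len(ancestors(concept))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    ancestors_b = set(ancestors(b))
    # Walking up from `a`, the first concept shared with `b` is the
    # deepest common ancestor (the least common subsumer, LCS).
    lcs = next(c for c in ancestors(a) if c in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

On this toy tree, `wu_palmer("dog", "cat")` is higher than `wu_palmer("dog", "sparrow")` because dog and cat share the deeper ancestor `mammal`.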