Collaborative design of formal vocabularies supported by argumentation theory and language processing


With the deployment of the Semantic Web and Linked Open Data, we have multiplied not only the data sources but also the vocabularies that structure and constrain the interpretation of these data. These vocabularies can be ontologies (RDF Schema, OWL) or thesauri (SKOS), sometimes augmented with additional rules (N3 or SPARQL/SPIN rules, SWRL, RIF) and constraints (RDF Shape, SHACL). Vocabulary directories now exist (e.g. LOV), but there is an ever-increasing demand for environments that simplify the editing of vocabularies, and collaborative contribution to them, by people who are not Semantic Web experts.

This creates a tension between a state of the art that is now very rich in formalisms and methods for modeling vocabularies, and a need to democratize and decentralize participation in the life cycle of vocabularies. In addition, the collective dimension of this participation, and the need for explanations and justifications when vocabularies are reused, could benefit from argumentation theory to guide, organize, capture and document the choices of conceptualization and formalization. Finally, the identification of the primitives that constitute these vocabularies can be greatly supported by natural language processing methods, which again facilitates contribution and scalability.

In this thesis we will consider the problems and opportunities that arise when coupling the collaborative design of formal vocabularies with argumentation theory and language processing, at the level of both knowledge models and algorithms, in order to:
  • support the design and choice of vocabularies based on state-of-the-art methodologies [1] and their formalization into rules and patterns. Building on a state of the art and a synthesis of methodological approaches, we will focus on their adaptation to the case of linked data on the Web, and in particular on the adaptation or extension of the standardized formalisms for it.
  • support collaborative editing and the development of the design rationale and its arguments. Argumentation theory can be used in this collaborative design phase to structure needs as graphs. It makes it possible to identify both the reasons for or against a given choice and the potential conflicts between needs [2].
  • support the extraction of vocabularies from unstructured text and semi-structured sources with natural language processing tools, to accelerate contributions and help scale the process [3].
  1. Examples of methodologies: METHONTOLOGY, DILIGENT, TERMINAE, HCOME, On-To-Knowledge (OTK), Ontology Development 101, KACTUS, SENSUS, CO4, KASquare, AKEM, SEKT, etc. Cf. also Fabien Gandon. Ontology Engineering: a Survey and a Return on Experience. RR-4396, INRIA. 2002.
  2. Isabelle Mirbel, Serena Villata. Enhancing Goal-Based Requirements Consistency: An Argumentation-Based Approach. CLIMA 2012: 110-127.
  3. Georgios Petasis, Vangelis Karkaletsis, Georgios Paliouras, Anastasia Krithara, Elias Zavitsanos. Ontology Population and Enrichment: State of the Art. In Georgios Paliouras, Constantine D. Spyropoulos, George Tsatsaronis (Eds.), Knowledge-Driven Multimedia Information Extraction and Ontology Evolution. Springer-Verlag, Berlin, Heidelberg, 2011: 134-166. See also TAC Knowledge Base Population.
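
As an illustration of how design needs and choices can be structured as an argument graph, here is a minimal sketch in Python. It assumes an abstract argumentation framework (arguments plus an "attacks" relation) evaluated under grounded semantics; this is one possible formalization chosen for the example, not necessarily the one the thesis will adopt. The function name and the arguments A1, A2 and A3 are hypothetical.

```python
from typing import Dict, Set, Tuple


def grounded_extension(arguments: Set[str],
                       attacks: Set[Tuple[str, str]]) -> Set[str]:
    """Return the grounded extension of a finite abstract argumentation framework.

    `attacks` contains pairs (attacker, target). The result is the least
    fixed point of the function mapping a set S to the arguments defended by S.
    """
    # Index the attackers of every argument.
    attackers: Dict[str, Set[str]] = {a: set() for a in arguments}
    for source, target in attacks:
        attackers[target].add(source)

    accepted: Set[str] = set()
    while True:
        # An argument is defended when each of its attackers is itself
        # attacked by at least one already accepted argument.
        defended = {
            a for a in arguments
            if all(attackers[b] & accepted for b in attackers[a])
        }
        if defended == accepted:
            return accepted
        accepted = defended


# Hypothetical design discussion about a vocabulary:
#   A1: "Model 'Author' as a class."
#   A2: "Model 'author' as a property only."   (conflicts with A1)
#   A3: "A property alone cannot carry the biographical data we need."
arguments = {"A1", "A2", "A3"}
attacks = {("A1", "A2"), ("A2", "A1"), ("A3", "A2")}

result = grounded_extension(arguments, attacks)
print(sorted(result))  # ['A1', 'A3']
```

In this toy graph the unattacked argument A3 defeats A2, which in turn reinstates A1. The accepted set therefore documents both the retained modeling choice and the reason the alternative was rejected.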

Scientific objectives:

  • Formalization of rules and patterns supporting the collaborative design of vocabularies on the Web of data, to capture and guide the steps of the life cycle of a vocabulary.
  • Formalization, capture and documentation of the networks of arguments justifying the conceptualization and the commitments (ontological, computational) of a vocabulary.
  • Algorithms, models and linguistic resources dedicated to the extraction of knowledge for the creation of a vocabulary and its formalization on the Semantic Web (RDF/S, SKOS, OWL, rules).


Expected profile:

Applicants must hold a master's degree in computer science, preferably with skills in knowledge representation and reasoning, natural language processing and the Semantic Web.


Work Context:

The thesis will take place within the Wimmics team (INRIA, I3S) in Sophia Antipolis, France.



This PhD subject is part of the funding program of the EDSTIC doctoral school, and applications are open until the 23rd of May 2016.


Contact:

Fabien Gandon, Serena Villata, Elena Cabrio