Information Retrieval System Using Multiwords Expressions (MWE) as Descriptors

Authors

  • Edson Marchetti da Silva, Universidade Federal de Minas Gerais
  • Renato Rocha Souza, Fundação Getúlio Vargas

DOI:

https://doi.org/10.4301/s1807-17752012000200002

Keywords:

Multiword Expression Extraction, Statistical Association Measures, Comparative Search, Information Retrieval System, Document Structure.

Abstract

This paper proposes an alternative method for retrieving documents, in which Multiword Expressions (MWE) extracted from a document base are used as descriptors for searching in an Information Retrieval System (IRS). Unlike methods that treat the text as a set of words (a bag of words), the proposed method takes the physical structure of the document into account when extracting MWE. The terms obtained from the pre-processed set with an exhaustive algorithmic technique proposed by the authors are compared with the results of thirteen different statistical association measures generated by the Ngram Statistics Package (NSP) software. To perform this experiment, a corpus of documents in digital format was set up.
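To give a sense of what a statistical association measure over word pairs looks like, the sketch below scores adjacent word pairs with pointwise mutual information (PMI). It is only an illustration under stated assumptions: it is not the authors' exhaustive technique, it does not call NSP, and the function name `bigram_pmi` and the sample sentence are hypothetical.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI} for every adjacent word pair in tokens.

    Hypothetical helper for illustration; probabilities are plain
    relative frequencies with no smoothing.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)               # number of word positions
    n_bi = max(len(tokens) - 1, 1)    # number of adjacent pairs
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the
        # probability expected if the two words were independent.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Made-up sample text, not taken from the paper's corpus.
tokens = "information retrieval system uses information retrieval descriptors".split()
scores = bigram_pmi(tokens)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

On a corpus this small, PMI tends to favour pairs whose words occur only once, which is one reason such measures are usually combined with a minimum frequency threshold or compared against other measures.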

Published

2012-08-29

How to Cite

Silva, E. M. da, & Souza, R. R. (2012). Information Retrieval System Using Multiwords Expressions (MWE) as Descriptors. Journal of Information Systems and Technology Management, 9(2), 213–234. https://doi.org/10.4301/s1807-17752012000200002

Section

Articles