CANeo is a software tool for classifying neologisms, designed for today's Information Society. It can help information professionals who work with the Spanish language, such as journalists, book authors, magazine writers, and lexicographers. The aim of the tool is to detect the root of a neologism and reconstruct the primitive word it derives from. That primitive word is the central element of the analysis, and each candidate is verified through the Lemmatization service. This service returns the grammatical category of the primitive word, which makes it possible to estimate the grammatical categories of the neologism.
Results provided by CANeo are as follows:
CANeo TIP detects neologisms formed by suffixation and prefixation in accordance with the rules of the Spanish language. For each neologism it reconstructs a collection of potential primitive words from which the neologism may have been formed. Classifying a neologism requires the grammatical categories of its primitive. To obtain them, CANeo TIP queries an external Lemmatization Service, which returns the primitives' grammatical categories along with other useful details. By combining statistical information about affixes with the primitives' grammatical categories, the tool can estimate the category of a neologism and classify it.
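The process described above can be illustrated with a minimal Python sketch. It is not CANeo TIP's actual implementation (which is written in C#): the affix table, the counts attached to each suffix, the toy lexicon, and the names `candidate_primitives`, `classify`, and `toy_lookup` are all assumptions made for illustration. A small in-memory dictionary stands in for the external Lemmatization Service.

```python
# Hypothetical affix data; the real tool relies on statistics from its own study.
PREFIXES = ["anti", "super", "ciber"]

# suffix -> (endings to restore on the stem,
#            category expected for the primitive,
#            illustrative counts of the categories the derived word takes)
SUFFIX_RULES = {
    "ero": (["", "o", "a"], "noun", {"noun": 7, "adjective": 3}),
    "mente": ([""], "adjective", {"adverb": 1}),
    "ble": (["r"], "verb", {"adjective": 1}),
}

# Toy stand-in for the Lemmatization Service: word -> grammatical categories.
TOY_LEXICON = {
    "tuit": {"noun"},
    "tuitear": {"verb"},
    "viral": {"adjective"},
}


def toy_lookup(word):
    """Return the categories of a known word, or an empty set."""
    return TOY_LEXICON.get(word, set())


def candidate_primitives(word):
    """Yield (primitive, affix, kind) for every affix that matches the word."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix) + 2:
            yield word[len(prefix):], prefix, "prefix"
    for suffix, (endings, _base, _result) in SUFFIX_RULES.items():
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            stem = word[: -len(suffix)]
            for ending in endings:
                yield stem + ending, suffix, "suffix"


def classify(word, lookup):
    """Estimate category proportions for a neologism from verified primitives."""
    scores = {}
    for primitive, affix, kind in candidate_primitives(word):
        categories = lookup(primitive)
        if not categories:
            continue  # the candidate primitive is not a known word
        if kind == "prefix":
            # Assumption: a prefixed word keeps the category of its primitive.
            for category in categories:
                scores[category] = scores.get(category, 0) + 1
        else:
            _endings, base_category, result_counts = SUFFIX_RULES[affix]
            if base_category in categories:
                for category, count in result_counts.items():
                    scores[category] = scores.get(category, 0) + count
    total = sum(scores.values())
    return {category: score / total for category, score in scores.items()}


print(classify("tuitero", toy_lookup))    # {'noun': 0.7, 'adjective': 0.3}
print(classify("antiviral", toy_lookup))  # {'adjective': 1.0}
```

In this sketch a candidate primitive only contributes to the estimate when the lookup confirms it as an existing word, mirroring the verification step that the Lemmatization Service performs in the real system.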
This software is a web application written in C# and ASP.NET, using the Microsoft .NET MVC pattern (Framework 3.5). It uses XML data sources to store the information involved in the analysis. It implements several design patterns and methods to speed up the analysis and to reduce the number of queries between the systems. The application is part of the TIP (Text & Information Processing) application pool and depends on the Lemmatization service to operate.
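The text does not say which patterns are used to reduce queries to the Lemmatization service. One common option, shown here purely as an assumption, is to cache lookups so that a primitive that appears in several candidate analyses is queried only once. The class `CountingLemmatizer` and the function `cached_lookup` are hypothetical names, and the service is simulated with a dictionary.

```python
from functools import lru_cache


class CountingLemmatizer:
    """Stand-in for the remote Lemmatization service that counts its calls."""

    def __init__(self, lexicon):
        self.lexicon = lexicon
        self.calls = 0

    def query(self, word):
        self.calls += 1  # each call represents one round trip to the service
        return frozenset(self.lexicon.get(word, ()))


def cached_lookup(service, maxsize=1024):
    """Wrap service.query so that repeated words are answered from memory."""

    @lru_cache(maxsize=maxsize)
    def lookup(word):
        return service.query(word)

    return lookup


service = CountingLemmatizer({"tuit": {"noun"}})
lookup = cached_lookup(service)

for word in ["tuit", "tuit", "blog", "tuit"]:
    lookup(word)

print(service.calls)  # 2: only "tuit" and "blog" reached the service
```

Results are returned as `frozenset` values so that cached entries cannot be modified by callers.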
This application is based on a study of seventy thousand compound words, which provides valuable information about the most productive affixes of the Spanish language, including their meanings, statistics, and specific details that help classify neologisms and properly estimate their grammatical category and relevance. The software core includes a rule engine that detects, isolates, and reconstructs primitives as follows:
CANeo TIP is the graduation thesis of Raúl Jiménez Estupiñán in Computer Engineering. The project was supervised by Francisco Javier Carreras Riudavets, with Zenón Hernández Figueroa and Gustavo Rodríguez Rodríguez contributing to the development of the libraries and the syllabification.
Carreras-Riudavets, F.; Jiménez-Estupiñán, R.; Hernández-Figueroa, Z.; Rodríguez-Rodríguez, G. (2012). Catalogador automático de neologismos sufijales y prefijales - CANeo TIP. Available at https://tulengua.es