On the Entropy of Written Spanish

Authors
Guerrero, Fabio G.
Description
This paper reports results on the entropy of the Spanish language. The results are based on an analysis of natural language at the level of n-word symbols (n = 1 to 18), trigrams, digrams, and characters, using twelve different literary works in Spanish as well as a 279,917-word news file provided by the Spanish press agency EFE. Entropy values are calculated by a direct method, using computer processing and the law of large numbers. Three samples of artificial Spanish produced by a software source implementing a first-order model are also analyzed and compared with natural Spanish.
Comment: Submitted to the IEEE Transactions on Information Theory
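
The description above does not include the author's software, so the following is only a minimal sketch of how such a calculation might look. It assumes that the "direct method" means estimating each symbol's probability by its relative frequency in the text (which the law of large numbers justifies for long samples) and then applying Shannon's entropy formula. It also assumes that n-grams are counted with overlap and that the first-order model draws words independently according to their observed frequencies. The function names and the toy sentence are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def ngrams(symbols, n):
    """Return the overlapping n-grams (as tuples) of a sequence of symbols."""
    return [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]


def entropy_bits(items):
    """Estimate entropy in bits per item from relative frequencies."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def first_order_sample(words, length, seed=0):
    """Draw words independently, weighted by their observed frequencies.

    Assumption: this is one common reading of a "first-order model";
    the paper's actual source may differ.
    """
    counts = Counter(words)
    rng = random.Random(seed)
    return rng.choices(list(counts), weights=list(counts.values()), k=length)


# Toy sample for illustration only; it is not data from the paper.
text = "el perro come y el gato come y el perro duerme"
words = text.split()
chars = list(text)

h_char = entropy_bits(chars)                # single characters
h_digram = entropy_bits(ngrams(chars, 2))   # character digrams
h_trigram = entropy_bits(ngrams(chars, 3))  # character trigrams
h_word = entropy_bits(words)                # 1-word symbols
h_2word = entropy_bits(ngrams(words, 2))    # 2-word symbols

# Artificial text whose word frequencies approximate those of the sample.
artificial = first_order_sample(words, 20)
h_artificial = entropy_bits(artificial)
```

With a sample this small the estimates are not meaningful; the relative frequencies only approach the true probabilities for long texts, and the number of distinct n-word symbols grows quickly with n, so larger n needs far more data.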
Keywords
Computer Science - Computation and Language, Computer Science - Information Theory