Eigenvector space model to capture features of documents

Choi DONGJIN; Kim PANKOO

Authors

Choi DONGJIN, Department of Computer Engineering, Chosun University, Gwangju, South Korea
Kim PANKOO, Department of Computer Engineering, Chosun University, Gwangju, South Korea

Keywords:

eigenvector, Vector Space Model, Natural Language Processing, document analyzing, Information Retrieval, text mining

Abstract

Eigenvectors are a special set of vectors associated with a linear system of equations. Because of their special properties, eigenvectors have been widely used in computer vision. Applied to information retrieval, they make it possible to capture properties of a document corpus. This paper reports simple experiments to show that eigenvectors can also be used in document analysis. The document corpus consists of the short Wikipedia abstracts provided by DBpedia. The original square matrix is built with tf-idf, the most popular term-weighting measure. After the eigenvectors of the original matrix are calculated, each vector is plotted in a 3D graph to examine what the eigenvectors mean in document processing.
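The pipeline outlined in the abstract (tf-idf weighting, a square matrix, eigenvectors, three coordinates per document) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the square matrix is formed, so the sketch assumes the document-by-document product of the tf-idf matrix with its own transpose. The four toy documents, the whitespace tokenizer, the function names, and the use of power iteration with deflation are all hypothetical choices made to keep the example dependency-free.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the tf-idf weight of term j in doc i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        length = max(len(toks), 1)
        rows.append([(counts[t] / length) * math.log(n / df[t]) for t in vocab])
    return vocab, rows


def gram(rows):
    """Assumed construction: square symmetric matrix A * A^T (docs x docs)."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_eigenpairs(m, k=3, iters=500):
    """Leading k eigenpairs of a symmetric positive semidefinite matrix,
    found by power iteration with deflation."""
    m = [row[:] for row in m]  # work on a copy
    size = len(m)
    pairs = []
    for idx in range(k):
        # Deterministic, non-uniform start vector, normalized to unit length.
        v = [1.0 / (i + 1 + idx) for i in range(size)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = matvec(m, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # remaining matrix is numerically zero
                break
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(m, v)))  # Rayleigh quotient
        pairs.append((lam, v))
        # Deflate: remove the component just found.
        for i in range(size):
            for j in range(size):
                m[i][j] -= lam * v[i] * v[j]
    return pairs


# Hypothetical toy corpus standing in for the DBpedia short abstracts.
docs = [
    "eigenvector matrix linear algebra",
    "eigenvector matrix face recognition",
    "document retrieval text mining",
    "document retrieval web search engine",
]
vocab, rows = tfidf_matrix(docs)
square = gram(rows)
pairs = top_eigenpairs(square, k=3)
# One 3D point per document: its entries in the three leading eigenvectors.
coords = [tuple(vec[i] for _, vec in pairs) for i in range(len(docs))]
```

Each tuple in `coords` could then be passed to any 3D scatter plot. With a real corpus, a library eigensolver would be a more practical choice than the hand-written power iteration used here.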

Published

2011-09-30

How to Cite

DONGJIN, C., & PANKOO, K. (2011). Eigenvector space model to capture features of documents. Annals of Spiru Haret University. Economic Series, 11(3), 73–80. Retrieved from https://anale.spiruharet.ro/economics/article/view/1136

Issue

Vol. 11 No. 3 (2011)

Section

ACADEMIA PAPERS

License

The Annals of Spiru Haret University. Economic Series operates under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, granting authors full copyright of their work without restrictions. This licensing framework ensures that the journal’s content can be shared and adapted non-commercially, provided appropriate credit is given and derivative works are distributed under the same terms.
