Hashtag sense clustering based on temporal similarity

STILO, GIOVANNI (Member of the Collaboration Group)
2017

Abstract

Hashtags are creative labels used in micro-blogs to characterize the topic of a message/discussion. Regardless of the use for which they were originally intended, hashtags cannot be used as a means to cluster messages with similar content. First, because hashtags are created in a spontaneous and highly dynamic way by users in multiple languages, the same topic can be associated with different hashtags, and conversely, the same hashtag may refer to different topics in different time periods. Second, contrary to common words, hashtag disambiguation is complicated by the fact that no sense catalogs (e.g., Wikipedia or WordNet) are available; and, furthermore, hashtag labels are difficult to analyze, as they often consist of acronyms, concatenated words, and so forth. A common way to determine the meaning of hashtags has been to analyze their context, but, as we have just pointed out, hashtags can have multiple and variable meanings. In this article, we propose a temporal sense clustering algorithm based on the idea that semantically related hashtags have similar and synchronous usage patterns.
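The idea of "similar and synchronous usage patterns" can be illustrated with a minimal sketch. The abstract names neither the similarity measure nor the clustering procedure, so everything below is an assumption made for illustration: hashtags are represented as equal-length daily count series, similarity is the Pearson correlation of the z-normalized series, and hashtags whose similarity reaches a hypothetical `threshold` are grouped by single-link merging. The function names and the toy counts are invented here and are not taken from the article.

```python
from math import sqrt


def znorm(series):
    """Z-normalize a count series; a constant series maps to all zeros."""
    n = len(series)
    mean = sum(series) / n
    std = sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in series]


def temporal_similarity(a, b):
    """Pearson correlation of two equal-length series (assumed measure)."""
    za, zb = znorm(a), znorm(b)
    return sum(x * y for x, y in zip(za, zb)) / len(za)


def cluster_by_time(series_by_tag, threshold=0.9):
    """Single-link grouping of hashtags whose series correlate >= threshold."""
    tags = sorted(series_by_tag)
    parent = {t: t for t in tags}

    def find(t):
        # Follow parent links to the cluster representative (path halving).
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, s in enumerate(tags):
        for t in tags[i + 1:]:
            sim = temporal_similarity(series_by_tag[s], series_by_tag[t])
            if sim >= threshold:
                parent[find(t)] = find(s)

    groups = {}
    for t in tags:
        groups.setdefault(find(t), []).append(t)
    return sorted(groups.values())


# Made-up toy data: daily tweet counts over eight days (not from the article).
toy_counts = {
    "#worldcup": [1, 2, 30, 40, 3, 1, 1, 1],
    "#wc2014": [0, 1, 25, 38, 2, 1, 0, 1],
    "#oscars": [1, 1, 1, 2, 1, 35, 40, 2],
}

clusters = cluster_by_time(toy_counts)
print(clusters)
```

In this toy data the two football hashtags peak on the same days and end up in one group, while the third hashtag peaks later and stays alone. Plain correlation captures only the shape and timing of usage; it says nothing about how the article handles noise, multiple languages, or meanings that shift over time.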
Keywords: Twitter mining; temporal mining; event detection
Stilo, Giovanni; Velardi, Paola (2017). Hashtag sense clustering based on temporal similarity. Computational Linguistics (ISSN: 1530-9312), 43(1), 181-200. DOI: 10.1162/COLI_a_00277.
Files in this item:
File: J17-1005.pdf
Access: Open Access
Type: Publisher's version
License: Creative Commons
Size: 267.45 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/253782
Citations
  • Scopus 17
  • Web of Science (ISI) 13