Hashtags are creative labels used in micro-blogs to characterize the topic of a message or discussion. However, because users create hashtags spontaneously and dynamically, and in multiple languages, the same topic can be associated with different hashtags and, conversely, the same hashtag may denote different topics in different time spans. Unlike sense clustering for common words, sense clustering for hashtags is complicated by the lack of sense catalogues such as Wikipedia or WordNet; furthermore, hashtag labels are often obscure. In this paper we propose a sense clustering algorithm based on temporal mining. First, hashtag time series are converted into strings of symbols using Symbolic Aggregate ApproXimation (SAX); then, hashtags are clustered based on string similarity and temporal co-occurrence. Evaluation is performed on two reference datasets of semantically tagged hashtags. We also evaluate the complexity of our algorithm, since efficiency is a crucial performance factor when processing large-scale data streams such as Twitter.
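To illustrate the first step named in the abstract, the following is a minimal sketch of a generic SAX conversion: z-normalize the series, average it over equal-width frames (piecewise aggregate approximation), and map each frame to a symbol using breakpoints that split the standard normal distribution into equiprobable regions. The function names `sax` and `positional_similarity`, the alphabet, the number of segments, and the similarity measure are all assumptions made for illustration; the abstract does not state the parameters or the string similarity the authors actually use.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abc"):
    """Convert a numeric time series into a SAX string.

    Generic SAX with illustrative parameters (hypothetical, not the
    settings of the paper). Requires len(series) to be divisible by
    n_segments to keep the sketch short.
    """
    if len(series) % n_segments != 0:
        raise ValueError("series length must be divisible by n_segments")

    # 1. z-normalize; a flat series is mapped to all zeros.
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # 2. Piecewise aggregate approximation: mean of each equal-width frame.
    width = len(z) // n_segments
    paa = [mean(z[i * width:(i + 1) * width]) for i in range(n_segments)]

    # 3. Breakpoints that cut N(0, 1) into len(alphabet) equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # 4. Each frame gets the symbol of the region its mean falls into.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


def positional_similarity(s1, s2):
    """Fraction of positions with the same symbol.

    A simple stand-in for a string similarity (an assumption, not the
    measure used in the paper).
    """
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    return sum(a == b for a, b in zip(s1, s2)) / len(s1)


# Hypothetical daily counts for two hashtags over eight days.
tag_a = [1, 1, 1, 1, 20, 25, 1, 1]
tag_b = [2, 2, 3, 2, 30, 40, 2, 3]
word_a = sax(tag_a, n_segments=4)
word_b = sax(tag_b, n_segments=4)
score = positional_similarity(word_a, word_b)
```

Two hashtags whose usage peaks in the same frames produce similar SAX strings, which is the intuition behind clustering on string similarity and temporal co-occurrence.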

Stilo, Giovanni; Velardi, Paola. (2014). Temporal Semantics: time-varying hashtag sense clustering. In Knowledge Engineering and Knowledge Management (pp. 563-578). Springer. ISBN: 978-3-319-13703-2. DOI: 10.1007/978-3-319-13704-9.

Temporal Semantics: time-varying hashtag sense clustering

STILO, GIOVANNI
Member of the Collaboration Group
2014

Keywords: temporal mining; semantics; hashtags
Files in this item:
File: 978-3-319-13704-9-tms.pdf
Access: archive managers only
Type: Publisher's version
License: All rights reserved
Size: 796.95 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/253770
Citations
  • Scopus 14
  • ISI 10
  • OpenAlex ND