String attractors: Verification and optimization

Kempa, D.; Policriti, A.; Prezza, Nicola; Rotenberg, E.

doi:10.4230/LIPIcs.ESA.2018.52

String attractors [STOC 2018] are combinatorial objects recently introduced to unify all known dictionary compression techniques in a single theory. A set γ ⊆ [1..n] is a k-attractor for a string S ∈ Σ^n if and only if every distinct substring of S of length at most k has an occurrence crossing at least one of the positions in γ. Finding the smallest k-attractor is NP-hard for k ≥ 3, but polylogarithmic approximations can be found using reductions from dictionary compressors. It is easy to reduce the k-attractor problem to a set-cover instance where the string's positions are interpreted as sets of substrings. The main result of this paper is a much more powerful reduction based on the truncated suffix tree. Our new characterization of the problem leads to more efficient algorithms for string attractors: we show how to check the validity and minimality of a k-attractor in near-optimal time and how to quickly compute exact solutions. For example, we prove that a minimum 3-attractor can be found in O(n) time when |Σ| ∈ O((log n)^{1/(3+ϵ)}) for some constant ϵ > 0, despite the problem being NP-hard for large Σ.
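To make the definition concrete, the following is a minimal Python sketch of a naive validity check and a brute-force search for a minimum k-attractor. It is not the truncated-suffix-tree method of the paper: it simply enumerates all substrings of length at most k and is meant only for tiny inputs. The function names `is_k_attractor` and `minimum_k_attractor` are hypothetical, and positions are assumed to be 1-based as in the abstract.

```python
from itertools import combinations
from typing import Iterable, Set


def is_k_attractor(s: str, gamma: Iterable[int], k: int) -> bool:
    """Naive check of the definition (not the paper's algorithm).

    Every distinct substring of s with length <= k must have at least one
    occurrence s[i..i+m-1] (1-based) that contains a position of gamma.
    """
    positions = set(gamma)
    n = len(s)
    all_substrings: Set[str] = set()
    covered: Set[str] = set()
    for m in range(1, min(k, n) + 1):
        for start in range(n - m + 1):  # 0-based start of the occurrence
            sub = s[start:start + m]
            all_substrings.add(sub)
            # The occurrence spans 1-based positions start+1 .. start+m.
            if any(start + 1 <= p <= start + m for p in positions):
                covered.add(sub)
    return covered == all_substrings


def minimum_k_attractor(s: str, k: int) -> Set[int]:
    """Exhaustive search over position sets of increasing size.

    Exponential in len(s); an illustration only, for very short strings.
    """
    n = len(s)
    for size in range(n + 1):
        for candidate in combinations(range(1, n + 1), size):
            if is_k_attractor(s, candidate, k):
                return set(candidate)
    return set(range(1, n + 1))  # unreachable: the full set is always valid


if __name__ == "__main__":
    print(is_k_attractor("abab", {2, 3}, 2))
    print(minimum_k_attractor("abab", 2))
```

For the string "abab" with k = 2, a single position cannot work because both "a" and "b" must occur at a position of γ, so any valid 2-attractor has at least two positions.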

Kempa, D.; Policriti, A.; Prezza, Nicola; Rotenberg, E. (2018). String attractors: Verification and optimization. In 26th European Symposium on Algorithms, ESA 2018. Leibniz International Proceedings in Informatics, LIPIcs (pp. 1-13). ISBN: 978-3-95977-081-1. DOI: 10.4230/LIPIcs.ESA.2018.52.

Conference year: 2018
ISBN: 978-3-95977-081-1
Keywords: Dictionary compression; Set cover; String attractors
Publication type: 04.1 - Contribution in conference proceedings (Paper in Proceedings)

Files in this record:

File	Size	Format
verification.pdf (Open Access; type: publisher's version; license: Creative Commons)	445.63 kB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/194127
