IRIS - Institutional Research Information System

Decompressing Lempel-Ziv compressed text / Bille, P.; Berggren Ettienne, M.; Gagie, T.; Li Gortz, I.; Prezza, Nicola. - Data Compression Conference, DCC 2020, (2020), pp. 143-152. (2020 Data Compression Conference, DCC 2020, Snowbird, USA, March 20-27, 2020). [10.1109/DCC47342.2020.00022].

Decompressing Lempel-Ziv compressed text

Bille P.;Berggren Ettienne M.;Gagie T.;Li Gortz I.;Prezza N.

2020

Abstract

We consider the problem of decompressing the Lempel-Ziv 77 representation of a string S of length n using working space as close as possible to the size z of the input. The folklore solution for the problem runs in O(n) time but requires random access to the whole decompressed text. Another folklore solution is to convert LZ77 into a grammar of size O(z log(n/z)) and then stream S in linear time. In this paper, we show that O(n) time and O(z) working space can be achieved for constant-size alphabets. On general alphabets of size σ, we describe (i) a trade-off achieving O(n log^δ σ) time and O(z log^{1-δ} σ) space for any 0 ≤ δ ≤ 1, and (ii) a solution achieving O(n) time and O(z log log (n/z)) space. The latter solution, in particular, dominates both folklore algorithms for the problem. Our solutions can, more generally, extract any specified subsequence of S with little overhead on top of the linear running time and working space. As an immediate corollary, we show that our techniques yield improved results for pattern matching problems on LZ77-compressed text.
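
For orientation, the first folklore solution mentioned in the abstract can be sketched as follows. This is only an illustration, not the paper's algorithm: the phrase format `(src, length, char)`, with `src` a 0-based position in the text decoded so far, and the function name `lz77_decode` are assumptions made for this example.

```python
def lz77_decode(phrases):
    """Decode a list of (src, length, char) phrases into a string.

    Assumed (hypothetical) phrase format: copy `length` characters starting
    at 0-based position `src` of the text decoded so far, then append `char`.
    """
    out = []  # the whole decoded text is kept for random access
    for src, length, char in phrases:
        if length > 0 and not 0 <= src < len(out):
            raise ValueError("source position outside decoded text")
        # Copy one character at a time so a phrase may overlap itself.
        for k in range(length):
            out.append(out[src + k])
        out.append(char)
    return "".join(out)


# Three phrases (z = 3) that decode to a string of length n = 7.
phrases = [(0, 0, "a"), (0, 0, "b"), (0, 4, "c")]
print(lz77_decode(phrases))  # prints "abababc"
```

The sketch runs in time linear in the output length, but the buffer `out` holds the entire decompressed text, which is the space cost the results above aim to avoid.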


Conference year: 2020
ISBN: 978-1-7281-6457-1
Appears in types: 04.1 - Contribution in conference proceedings (Paper in Proceedings)

Files in this record:

File	Size	Format
LZ77_decompression.pdf (archive managers only; type: pre-print document; license: all rights reserved)	313.83 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/195656
