Re-Pair [5] is an effective grammar-based compression scheme achieving strong compression rates in practice. Let n, σ, and d be the text length, alphabet size, and dictionary size of the final grammar, respectively. In their original paper, the authors show how to compute the Re-Pair grammar in expected linear time and 5n + 4σ2 + 4d + √n words of working space on top of the text. In this work, we propose two algorithms improving on the space of their original solution. Our model assumes a memory word of [log2n] bits and a re-writable input text composed by n such words. Our first algorithm runs in expected O(n/ϵ) time and uses (1 + ϵ)n + √n words of space on top of the text for any parameter 0
Space-Efficient Re-Pair Compression / Bille, P.; Gortz, I. L.; Prezza, Nicola. - Data Compression Conference Proceedings, (2017), pp. 171-180. (2017 Data Compression Conference, DCC 2017, Snowbird, Utah, April 4-7, 2017). [10.1109/DCC.2017.24].
Space-Efficient Re-Pair Compression
Prezza N.
2017
Abstract
Re-Pair [5] is an effective grammar-based compression scheme achieving strong compression rates in practice. Let n, σ, and d be the text length, alphabet size, and dictionary size of the final grammar, respectively. In their original paper, the authors show how to compute the Re-Pair grammar in expected linear time and 5n + 4σ2 + 4d + √n words of working space on top of the text. In this work, we propose two algorithms improving on the space of their original solution. Our model assumes a memory word of [log2n] bits and a re-writable input text composed by n such words. Our first algorithm runs in expected O(n/ϵ) time and uses (1 + ϵ)n + √n words of space on top of the text for any parameter 0File | Dimensione | Formato | |
---|---|---|---|
repair.pdf
Open Access
Tipologia:
Documento in Pre-print
Licenza:
DRM (Digital rights management) non definiti
Dimensione
346.71 kB
Formato
Adobe PDF
|
346.71 kB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.