Logo del repository
  1. Home
 
Opzioni

Regular languages meet prefix sorting

Alanko J.
•
D'Agostino G.
•
Policriti A.
•
Prezza N.
2020
  • conference object

Abstract
Indexing strings via prefix (or suffix) sorting is, arguably, one of the most successful algorithmic techniques developed in the last decades. Can indexing be extended to languages? The main contribution of this paper is to initiate the study of the sub-class of regular languages accepted by an automaton whose states can be prefix-sorted. Starting from the recent notion of Wheeler graph [Gagie et al., TCS 2017]-which extends naturally the concept of prefix sorting to labeled graphs-we investigate the properties of Wheeler languages, that is, regular languages admitting an accepting Wheeler finite automaton. We first characterize this family as the natural extension of regular languages endowed with the co-lexicographic ordering: the sorted prefixes of strings belonging to a Wheeler language are partitioned into a finite number of co-lexicographic intervals, each formed by elements from a single Myhill-Nerode equivalence class. We proceed by proving several results related to Wheeler automata: (i) We show that every Wheeler NFA (WNFA) with n states admits an equivalent Wheeler DFA (WDFA) with at most 2n − 1 − |Σ| states (Σ being the alphabet) that can be computed in O(n3) time. (ii) We describe a quadratic algorithm to prefix-sort a proper superset of the WDFAs, a O(n log n)-time online algorithm to sort acyclic WDFAs, and an optimal linear-time offline algorithm to sort general WDFAs. (iii) We provide a minimization theorem that characterizes the smallest WDFA recognizing the same language of any input WDFA. The corresponding constructive algorithm runs in optimal linear time in the acyclic case, and in O(n log n) time in the general case. (iv) We show how to compute the smallest WDFA equivalent to any acyclic DFA in nearly-optimal time. Our contributions imply new results of independent interest. Contributions (i-iii) provide a new class of NFAs for which the minimization problem can be approximated within a constant factor in polynomial time. Contribution (iv) provides a provably minimum-size solution for the well-studied problem of indexing deterministic-acyclic graphs for linear-time pattern matching queries.
WOS
WOS:000554408100055
Archivio
http://hdl.handle.net/11390/1182128
info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-85084083718
Diritti
metadata only access
Visualizzazioni
1
Data di acquisizione
Apr 19, 2024
Vedi dettagli
google-scholar
Get Involved!
  • Source Code
  • Documentation
  • Slack Channel
Make it your own

DSpace-CRIS can be extensively configured to meet your needs. Decide which information need to be collected and available with fine-grained security. Start updating the theme to match your nstitution's web identity.

Need professional help?

The original creators of DSpace-CRIS at 4Science can take your project to the next level, get in touch!

Realizzato con Software DSpace-CRIS - Estensione mantenuta e ottimizzata da 4Science

  • Impostazioni dei cookie
  • Informativa sulla privacy
  • Accordo con l'utente finale
  • Invia il tuo Feedback