On Strings Having the Same Length- k Substrings

Giulia Bernardini
•
Alessio Conte
•
Esteban Gabory
(more)
Michelle Sweering
2022
  • conference object

Abstract
Let Substr_k(X) denote the set of length-k substrings of a given string X for a given integer k > 0. We study the following basic string problem, called z-Shortest S_k-Equivalent Strings: Given a set S_k of n length-k strings and an integer z > 0, list z shortest distinct strings T_1,...,T_z such that Substr_k(T_i) = S_k, for all i ∈ [1,z]. The z-Shortest S_k-Equivalent Strings problem arises naturally as an encoding problem in many real-world applications; e.g., in data privacy, in data compression, and in bioinformatics. The 1-Shortest S_k-Equivalent Strings problem, referred to as Shortest S_k-Equivalent String, asks for a shortest string X such that Substr_k(X) = S_k. Our main contributions are summarized below:

- Given a directed graph G(V,E), the Directed Chinese Postman (DCP) problem asks for a shortest closed walk that visits every edge of G at least once. DCP can be solved in Õ(|E||V|) time using an algorithm for min-cost flow. We show, via a non-trivial reduction, that if Shortest S_k-Equivalent String over a binary alphabet has a near-linear-time solution then so does DCP.
- We show that the length of a shortest string output by Shortest S_k-Equivalent String is in O(k + n^2). We generalize this bound by showing that the total length of z shortest strings is in O(zk + zn^2 + z^2 n). We derive these upper bounds by showing (asymptotically tight) bounds on the total length of z shortest Eulerian walks in general directed graphs.
- We present an algorithm for solving z-Shortest S_k-Equivalent Strings in O(nk + n^2 log^2 n + zn^2 log n + |output|) time. If z = 1, the time becomes O(nk + n^2 log^2 n) by the fact that the size of the input is Θ(nk) and the size of the output is O(k + n^2).
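
To make the definition of Substr_k and the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the algorithm from the paper: it simply enumerates candidate strings by increasing length, takes exponential time, and is only meant for tiny inputs. The function names `substr_k` and `z_shortest_equivalent` and the `max_len` cutoff are assumptions introduced for illustration.

```python
from itertools import product


def substr_k(x: str, k: int) -> set[str]:
    """Return the set of length-k substrings of x (empty if len(x) < k)."""
    return {x[i:i + k] for i in range(len(x) - k + 1)}


def z_shortest_equivalent(s_k: set[str], z: int, max_len: int) -> list[str]:
    """Brute force (hypothetical helper, NOT the paper's algorithm).

    Lists up to z shortest distinct strings T with substr_k(T) == s_k,
    trying every string over the input alphabet in order of length.
    Ties between strings of equal length are broken lexicographically.
    """
    k = len(next(iter(s_k)))                      # all input strings have length k
    alphabet = sorted({c for s in s_k for c in s})
    found: list[str] = []
    for length in range(k, max_len + 1):          # shortest candidates first
        for letters in product(alphabet, repeat=length):
            t = "".join(letters)
            if substr_k(t, k) == s_k:
                found.append(t)
                if len(found) == z:
                    return found
    return found                                  # fewer than z found within max_len
```

For example, with S_k = {"ab", "ba"} and k = 2, calling `z_shortest_equivalent({"ab", "ba"}, 3, 5)` first finds the two length-3 strings "aba" and "bab" and then the length-4 string "abab".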
DOI
10.4230/lipics.cpm.2022.16
Archive
http://hdl.handle.net/11368/3024591
info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-85134333057
https://drops.dagstuhl.de/opus/volltexte/2022/16140/
Rights
open access
license:creative commons
license uri:http://creativecommons.org/licenses/by/4.0/
FVG url
https://arts.units.it/bitstream/11368/3024591/2/LIPIcs-CPM-2022-16.pdf
Subjects
  • string algorithm

  • combinatorics on word...

  • de Bruijn graph

  • Chinese Postman
