This thesis explores how the FAIR (Findable, Accessible, Interoperable, Reusable) data principles
can be operationalized, and the challenges of implementing them, in two very different scientific
domains: riverine litter data management and nanopore sequencing data analysis. It responds to a
growing need for structured data management frameworks in the wake of exponential growth in
scientific data generation, as many existing, actively used datasets suffer from fragmentation,
missing documentation, and poor reusability.
The first project sought to standardize and migrate a large riverine litter dataset, covering 716
micro research campaigns and 12,143 samples across waterways around the world, from a
heterogeneous Excel format to a normalized MySQL database. The second project developed an
automated Python-based tool for validating and analyzing Oxford Nanopore sequencing outputs,
improving quality assessment and metadata extraction across a variety of output file formats.
Both applications showed marked improvements in data utility despite a series of challenges, namely
data heterogeneity, persistent identifier adoption, and limited resources. The riverine litter
database was fully migrated to more standardized terminology and hierarchical classification
systems, enabling cross-continental comparisons. The sequencing analysis application implemented
automated quality assessments, context-aware reports, and tiered metadata extraction, which
shortened the time-to-insight for sequencing run assessments.
Although the FAIR principles required considerable adaptation to each scientific domain, the
results showed that, once those adaptations were completed, the practical benefits outweighed
their cost in terms of improved data discoverability, decreased redundancy, and improved
reproducibility. The work points to the economic impact of FAIR adoption and highlights the danger
of duplicated, wasted research effort if FAIR principles are not adopted. It also stresses the need
for institutional policies, specialized training, and long-lasting supportive technical infrastructure
for further FAIR implementation across science.