A Domain Knowledge-based Approach for Automatic Correction of Printed Invoices

SORIO, Enrico • BARTOLI, Alberto • DAVANZO, Giorgio • MEDVET, Eric
2012
  • conference object

Abstract
Although OCR technology is now commonplace, character recognition errors remain a problem, particularly in automated systems for information extraction from printed documents. This paper proposes a method for the automatic detection and correction of OCR errors in an information extraction system. Our algorithm uses domain knowledge about possible character misrecognitions to propose corrections; it then exploits knowledge about the type of the extracted information to perform syntactic and semantic checks that validate the proposed corrections. We assess our proposal on a real-world, highly challenging dataset composed of nearly 800 values extracted from approximately 100 commercial invoices, and we obtain very good results.
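
The two stages named in the abstract (propose corrections from known character confusions, then validate them with syntactic and semantic checks) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the confusion table `CONFUSIONS`, the functions `candidates`, `is_valid_date`, and `correct`, the `dd/mm/yyyy` date format, and the rule of accepting a correction only when exactly one candidate passes are all assumptions made for illustration.

```python
import re
from datetime import datetime
from itertools import product

# Hypothetical confusion table (an assumption, not taken from the paper):
# each OCR output character maps to the characters it may really stand for.
CONFUSIONS = {
    "O": ["0"],
    "o": ["0"],
    "l": ["1"],
    "I": ["1"],
    "S": ["5"],
    "B": ["8"],
    "Z": ["2"],
}


def candidates(value, confusions=CONFUSIONS, limit=1000):
    """Yield the value itself plus variants with confusable characters swapped.

    The limit caps the number of generated candidates, since the number of
    combinations grows exponentially with the count of confusable characters.
    """
    options = [[ch] + confusions.get(ch, []) for ch in value]
    for i, combo in enumerate(product(*options)):
        if i >= limit:
            break
        yield "".join(combo)


def is_valid_date(text):
    """Syntactic check (dd/mm/yyyy shape) plus semantic check (real date)."""
    if not re.fullmatch(r"[0-9]{2}/[0-9]{2}/[0-9]{4}", text):
        return False
    try:
        datetime.strptime(text, "%d/%m/%Y")
    except ValueError:
        return False
    return True


def correct(value, validator):
    """Return the single candidate that passes validation, else None."""
    valid = [c for c in candidates(value) if validator(c)]
    return valid[0] if len(valid) == 1 else None


if __name__ == "__main__":
    print(correct("l2/O3/2Ol2", is_valid_date))
    print(correct("3l/O2/2012", is_valid_date))
```

In this sketch, a value that matches the expected shape but fails the semantic check (for example, a day that does not exist in the given month) is rejected rather than corrected, which mirrors the validation role the abstract assigns to type knowledge.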
Archive
http://hdl.handle.net/11368/2507941
info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-84867340338
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6285067&contentType=Conference+Publications&refinements%3D4292053679%26sortType%3Dasc_p_Sequence%26filter%3DAND%28p_IS_Number%3A6284715%29
Rights
metadata only access
Subjects
  • Document understandin...
  • error correction
  • error detection
  • optical character rec...
