Retrieving content based documents by using image processing & NLP through query
Keywords:
Optical Character Recognition (OCR), Natural Language Processing (NLP), Image Processing, Artificial Intelligence, Image acquisition, Pre-processing, Pattern Recognition
Abstract:
Documents have traditionally been searched and retrieved on the basis of manually assigned textual annotations, that is,
metadata such as topic, keyword, date, and time, which a computer can easily interpret and run retrieval over. To overcome
this dependence on manual annotation, image processing and NLP can be used so that the desired information is retrieved
from the document itself rather than from its annotations; NLP is then used to match the extracted textual information
against the query. The aim of the system is to retrieve a particular document from a database by passing a keyword as a
query and comparing that keyword with the actual content of the document. To do so, we use image processing together with
Natural Language Processing, one of the important concepts in AI. Converting paper-based documents into digital images is
an effective way to preserve their content. Searching the document images stored in a repository using time as a query is
one of the important tasks of document retrieval.
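
As a rough illustration of the content-based retrieval idea described above, the following is a minimal sketch in Python. It assumes the text has already been extracted from each document image by an OCR step that is not shown. The names `tokenize`, `retrieve`, and `extracted`, the sample texts, and the simple keyword-count scoring are all hypothetical choices made for this example and are not taken from the system itself.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents):
    """Rank documents by how often the query keywords occur in their text.

    `documents` maps a document id to the text extracted from its image
    (assumed to be the output of an OCR step that is not shown here).
    """
    keywords = set(tokenize(query))
    scores = {}
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[k] for k in keywords)
        if score > 0:  # keep only documents whose content matches the query
            scores[doc_id] = score
    # Highest score first; ties are broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical OCR output for three scanned pages (illustrative data only).
extracted = {
    "invoice_01": "Invoice for office supplies, total amount due 120",
    "memo_02": "Meeting memo: budget review and invoice approval, invoice pending",
    "letter_03": "Letter of recommendation for a graduate student",
}

if __name__ == "__main__":
    print(retrieve("invoice", extracted))
```

Here the query is compared with the document content directly rather than with manually assigned annotations. A fuller system would likely replace the raw keyword count with stemming, stop-word removal, or a weighted ranking scheme.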