Library for extracting plain text from documents (files) for further processing (indexing and searching).