Using Optical Character Recognition (OCR) to convert digital images of books into editable text for Natural Language Processing analysis