doc_curation.pdf
Curate and process pdf files.
-
doc_curation.pdf.
ocr
(pdf_path, google_key='/home/vvasuki/gitland/vvasuki-git/sysconf/kunchikA/google/sanskritnlp/service_account_key.json')[source]
OCR some pdf with google drive. Automatically splits into 25 page bits and ocrs them individually.
- Parameters
pdf_path –
- Returns
-
-
doc_curation.pdf.
split_into_small_pdfs
(pdf_path, output_directory=None, start_page=1, end_page=None, small_pdf_pages=25)[source]