Thomas Hegghammer has published his first R package, daiR, on CRAN.

[Image: Arabic text as picture]

R interface for the Google Cloud Services 'Document AI API' <https://cloud.google.com/document-ai/> with additional tools for output file parsing and text reconstruction. 'Document AI' is a powerful server-based OCR processor that extracts text and tables from images and PDF files with high accuracy. 'daiR' gives R users programmatic access to this processor and additional tools to handle and visualize the output. See the package website <https://dair.info/> for more information and examples.

Keywords: R, text processing, data science, political science
Published June 16, 2021 07:57 - Last modified June 16, 2021 07:59