I love to write physical notes in a notebook. However, I find it frustrating that I need to manually write up my notes later into digital medium. I have been looking out for a handwriting to OCR solution for years and years.

CRAFT + TrOCR

A solution developed pre-LLM era (Github repo ). Only works when I write really slowly and clearly which I don’t do most of the time. My hand writing is pretty spidery so this doesn’t work for me.

GPT4 + Vision

Inspired by Tiago Forte - take a photo of your notebook page followed by “please digitise these handwritten notes for me”. The result is pretty amazing, especially with my awful spidery writing. It’s still probabilistic generation so quite often if I have written something unusual it will decide to use a word that is more likely.

Big downsides here: I don’t want to share my innermost thoughts and feelings with OpenAI and their commercial offerings (like ChatGPT with Vision) are where they do their data collection from. Contractually at least, they don’t do data collection from their API so it may be worth looking into this as a possible solution.

Todo

  • Try out some other commercial LLM solutions I guess. I’d much prefer something I can self-host though, especially for sensitive notes.
  • I need to test Llava-1.6’s handwriting OCR capabilities. A GGUF formatted version of the model can be found here