Google Text Scanner

The ML Kit Text Recognition API can recognize text in any Latin-based character set. It can also be used to automate data-entry tasks such as processing credit cards, receipts, and business cards.

Key capabilities

  • Recognize text across Latin-based languages: supports recognizing text written in Latin script.
  • Analyze structure of text: supports detection of words (elements), lines, and paragraphs.
  • Identify language of text: identifies the language of the recognized text.
  • Small application footprint: on Android, the API is offered as an unbundled library through Google Play Services.
  • Real-time recognition: can recognize text in real time on a wide range of devices.

Text Recognition v2 is now available in beta. It improves text recognition accuracy and adds support for Chinese, Devanagari, Japanese, and Korean scripts.

Text structure

The Text Recognizer segments text into blocks, lines, and elements. Roughly speaking:

  • a Block is a contiguous set of text lines, such as a paragraph or column,

  • a Line is a contiguous set of words on the same axis, and

  • an Element is a contiguous set of alphanumeric characters (a "word") on the same axis in most Latin-script languages, or a single character in others.
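
The nesting described above can be sketched as a small data model. This is only an illustration in plain Python, not the ML Kit API: the class names `Block`, `Line`, and `Element`, the helper `iter_elements`, and the sample text are all hypothetical, and joining elements with spaces assumes a Latin-script language where each element is a word.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Element:
    """Smallest unit: a word in most Latin-script languages (hypothetical type)."""
    text: str


@dataclass
class Line:
    """A run of elements on the same axis (hypothetical type)."""
    elements: List[Element] = field(default_factory=list)

    @property
    def text(self) -> str:
        # Assumption: elements are words, so they are joined with spaces.
        return " ".join(element.text for element in self.elements)


@dataclass
class Block:
    """A contiguous set of lines, such as a paragraph or column (hypothetical type)."""
    lines: List[Line] = field(default_factory=list)

    @property
    def text(self) -> str:
        return "\n".join(line.text for line in self.lines)


def iter_elements(blocks: List[Block]) -> Iterator[Tuple[int, int, int, str]]:
    """Walk the hierarchy top-down, yielding (block, line, element) indices and text."""
    for b, block in enumerate(blocks):
        for l, line in enumerate(block.lines):
            for e, element in enumerate(line.elements):
                yield b, l, e, element.text


# Made-up sample: one block containing two lines.
blocks = [
    Block(lines=[
        Line(elements=[Element("Hello"), Element("world")]),
        Line(elements=[Element("from"), Element("ML"), Element("Kit")]),
    ])
]

for b, l, e, text in iter_elements(blocks):
    print(f"Block {b} / Line {l} / Element {e}: {text}")
```

Walking the structure from the outside in mirrors how a recognizer result would typically be consumed: block-level text for paragraphs, line-level text for fields such as a receipt row, and element-level text for individual words.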

The image below highlights an example of each, from largest to smallest. The first highlighted region, in cyan, is a Block of text. The second set of highlighted regions, in blue, are Lines of text. The third set of highlighted regions, in dark blue, are Elements (words).

[Image: sample text with a Block, its Lines, and its Elements highlighted]

For all detected blocks, lines and elements, the API returns the bounding boxes, corner points, recognized languages and recognized text.

Example results

Photo: Dietmar Rabich, Wikimedia Commons, "Düsseldorf, Wege der parlamentarischen Demokratie -- 2015 -- 8123", CC BY-SA 4.0

Recognized Text
  Text: Wege
        der parlamentarischen
        Demokratie
  Blocks: (1 block)

Block 0
  Text: Wege der parlamentarischen Demokratie
  Frame: (117.0, 258.0, 190.0, 83.0)
  Corner Points: (117, 270), (301.64, 258.49), (306.05, 329.36), (121.41, 340.86)
  Recognized Language Code: de
  Lines: (3 lines)

Line 0
  Text: Wege der
  Frame: (167.0, 261.0, 91.0, 28.0)
  Corner Points: (167, 267), (255.82, 261.46), (257.19, 283.42), (168.36, 288.95)
  Recognized Language Code: de
  Elements: (2 elements)

Element 0
  Text: Wege
  Frame: (167.0, 263.0, 59.0, 26.0)
  Corner Points: (167, 267), (223.88, 263.45), (225.25, 285.41), (168.36, 288.95)
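
The relationship between Frame and Corner Points in the sample values can be checked with a short sketch. It assumes that a Frame is laid out as (x, y, width, height) and that it is the axis-aligned box enclosing the four corner points, rounded outward to whole pixels. Both assumptions are inferred from the numbers in this example and are not stated by the API; `enclosing_frame` is a hypothetical helper, not a library function.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Frame = Tuple[float, float, float, float]  # assumed layout: (x, y, width, height)


def enclosing_frame(corners: List[Point]) -> Frame:
    """Axis-aligned box around the corner points, rounded outward to whole pixels."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    left, top = math.floor(min(xs)), math.floor(min(ys))
    right, bottom = math.ceil(max(xs)), math.ceil(max(ys))
    return (float(left), float(top), float(right - left), float(bottom - top))


# Corner points copied from the example results.
block_0 = [(117, 270), (301.64, 258.49), (306.05, 329.36), (121.41, 340.86)]
line_0 = [(167, 267), (255.82, 261.46), (257.19, 283.42), (168.36, 288.95)]
element_0 = [(167, 267), (223.88, 263.45), (225.25, 285.41), (168.36, 288.95)]

print(enclosing_frame(block_0))    # (117.0, 258.0, 190.0, 83.0)
print(enclosing_frame(line_0))     # (167.0, 261.0, 91.0, 28.0)
print(enclosing_frame(element_0))  # (167.0, 263.0, 59.0, 26.0)
```

Under these assumptions the computed boxes match the Frame values listed for Block 0, Line 0, and Element 0. The corner points carry the extra information: they follow the slight tilt of the text, while the Frame is an upright rectangle around it.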