Utilizing the Power of the OCR Model in Salesforce

What is Optical Character Recognition (OCR)?

OCR, or Optical Character Recognition, is a process of recognizing text inside images and converting it into an electronic form. These images could be of handwritten text, printed text like documents, receipts, name cards, etc., or even a natural scene photograph.

OCR technology has changed the document management process. It digitizes printed text so that it can be stored electronically, searched, and used in machine processes such as text-to-speech, machine translation, and text mining. OCR transforms scanned images into fully searchable documents whose text content can be recognized by computers.

Einstein Platform Services

The use of AI has become increasingly prominent in business. More than six billion predictions are generated every day by Einstein AI capabilities embedded across Salesforce, putting the power of machine learning at the fingertips of millions of sales, service, marketing and commerce professionals. Einstein enables business users to be smarter and more productive, so they can spend time and resources on what matters most: their customers.

Einstein Platform Services bring a wide variety of machine learning capabilities (such as image recognition, sentiment, intent and more) directly into Salesforce’s trusted and secure infrastructure. Using just a single line of code, developers can deploy these services into any Salesforce page or build a reusable AI-powered Lightning Web Component that admins can then deploy to end users with just a few clicks. Developers also don't need any model setup, training or data science skills to work with Einstein OCR.

One of the latest Einstein Platform Services is Einstein Optical Character Recognition (OCR), which provides models that detect alphanumeric text in images.

Einstein Optical Character Recognition (OCR):

Einstein Optical Character Recognition (OCR) leverages computer vision to analyze documents and extract the relevant information, making repetitive tasks like data entry more efficient. For example, a developer can build OCR into a Lightning Web Component to enable sales reps to easily upload a business card from a prospect and automatically analyze the unstructured data from the image file, translate that data into text and extract information that can then be updated in the corresponding Salesforce record.

You can access the models from a single REST API endpoint. Each model is used for specific use cases, such as business card scanning, product lookup, and digitizing documents and tables.

When you call the API, you send in an image and the JSON response contains various elements based on the value of the task parameter:

·String of alphanumeric characters that the model predicts.

·Confidence (probability) that the detected bounding box contains text.

·XY coordinates for the location of the character string within the image (also called a bounding box).

·For tabular data, the table row and column in which the text is located.

·For business cards, the entity type of the detected text such as ORG, PERSON, and so on.
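As a rough illustration of how these elements might be consumed, here is a minimal Python sketch that turns a response body into typed records and keeps only detections above a confidence threshold. The `Detection` class, the `parse_detections` function and the 0.9 threshold are illustrative choices, not part of the Einstein API; the JSON field names are the ones that appear in the sample task="contact" response later in this article.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    label: str            # predicted string of characters
    probability: float    # confidence that the bounding box contains text
    min_x: int
    min_y: int
    max_x: int
    max_y: int
    tag: Optional[str]    # entity type; None when the response has no tag


def parse_detections(body: str, min_probability: float = 0.0) -> List[Detection]:
    """Parse an OCR response body and keep detections at or above a threshold."""
    payload = json.loads(body)
    detections = []
    for item in payload.get("probabilities", []):
        if item["probability"] < min_probability:
            continue
        box = item["boundingBox"]
        detections.append(Detection(
            label=item["label"],
            probability=item["probability"],
            min_x=box["minX"], min_y=box["minY"],
            max_x=box["maxX"], max_y=box["maxY"],
            tag=item.get("attributes", {}).get("tag"),
        ))
    return detections


# Two entries taken from the sample response, used here as test input.
sample = json.dumps({
    "task": "contact",
    "probabilities": [
        {"probability": 0.9978701, "label": "Randy E. Cornwell",
         "boundingBox": {"minX": 53, "minY": 73, "maxX": 297, "maxY": 103},
         "attributes": {"tag": "PERSON"}},
        {"probability": 0.89205074, "label": "re.comwell@phillips86.com",
         "boundingBox": {"minX": 478, "minY": 444, "maxX": 782, "maxY": 476},
         "attributes": {"tag": "EMAIL"}},
    ],
    "object": "predictresponse",
})

confident = parse_detections(sample, min_probability=0.9)
print([(d.label, d.tag) for d in confident])
```

With the threshold at 0.9, the email detection (probability about 0.89) is dropped, which is the kind of entry you might route to manual review instead of writing straight to a record.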

Let’s see how we can use Business Card scanning capabilities for building smarter apps.

Scanning Business Card using Einstein OCR:

Sales reps usually receive many business cards at trade shows and conferences, and entering this data into the system manually is inefficient. Studies have also shown that most of the business cards received don’t get captured in Salesforce. With the Einstein OCR model’s capabilities, you can convert physical business card data into Salesforce records quickly and accurately, which can significantly speed up this part of your business flow.

Using Einstein Vision and Language Model Builder app:

If you are not a developer but still want to explore the Einstein Platform offerings or try a proof of concept (POC), you can install the Einstein Vision and Language Model Builder app from AppExchange. This article will give you a jump start for using the app. Additionally, you can use the documentation provided in the resources section.

Using this app is very simple and straightforward.

1. Install the app from AppExchange.
2. Configure the Einstein settings using the Einstein Admin page. Read detailed steps at https://quip.com/z6a4AlCUw8n3

3. Select the OCR tab.

6. Hooray! Your OCR model’s result is now available in the response panel on the right side of the page. You can also analyse the raw response, or click on a colored section in the formatted response tab to see the result.

Note: The tag attribute represents the content type. Possible values are:

ADDRESS
EMAIL
FAX
HOME_PHONE
MOBILE_PHONE
OFFICE_PHONE
ORG
OTHER
PERSON
PHONE
WEBSITE
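One way to use these tags is to map them onto the fields of the record you want to populate. The sketch below is only an illustration: the target field names in `TAG_TO_FIELD` are placeholders to adapt to your own object, and the helper name `to_fields` is hypothetical. When several detections map to the same field, it keeps the one with the highest probability, and it skips OTHER.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical mapping from OCR tags to target field names (placeholders).
TAG_TO_FIELD = {
    "PERSON": "Name",
    "ORG": "Company",
    "EMAIL": "Email",
    "PHONE": "Phone",
    "OFFICE_PHONE": "Phone",
    "MOBILE_PHONE": "MobilePhone",
    "HOME_PHONE": "HomePhone",
    "FAX": "Fax",
    "WEBSITE": "Website",
    "ADDRESS": "Address",
}


def to_fields(detections: Iterable[Tuple[str, str, float]]) -> Dict[str, str]:
    """Build {field: value} from (tag, label, probability) triples."""
    best: Dict[str, Tuple[str, float]] = {}
    for tag, label, probability in detections:
        field = TAG_TO_FIELD.get(tag)
        if field is None:       # OTHER (and any unknown tag) is skipped
            continue
        if field not in best or probability > best[field][1]:
            best[field] = (label.strip(), probability)
    return {field: value for field, (value, _) in best.items()}


fields = to_fields([
    ("PERSON", "Randy E. Cornwell", 0.9978701),
    ("OTHER", "phone ", 0.9984518),
    ("PHONE", "284.575.2884", 0.9984518),
    ("MOBILE_PHONE", "281.889.0997", 0.9973475),
    ("WEBSITE", "Phillips66.com", 0.9908977),
])
print(fields)
```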

API Details:

Endpoint: https://api.einstein.ai/v2/vision/ocr
Method: POST
Request Parameters:

Sample request:

curl -X POST -H "Authorization: Bearer <TOKEN>" -F sampleLocation="<CARD_LOCATION>" -F task="contact" -F modelId="OCRModel" https://api.einstein.ai/v2/vision/ocr
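For readers who prefer to call the endpoint from code, here is a minimal Python sketch of the same request using only the standard library. It builds the multipart/form-data POST but does not send it, so it can be inspected safely. The function name `build_ocr_request` and the example.com image URL are placeholders for illustration; the form fields (`sampleLocation`, `task`, `modelId`) and the endpoint are the ones shown above.

```python
import urllib.request
import uuid

OCR_ENDPOINT = "https://api.einstein.ai/v2/vision/ocr"


def build_ocr_request(token: str, image_url: str,
                      task: str = "contact",
                      model_id: str = "OCRModel") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for the OCR endpoint."""
    boundary = uuid.uuid4().hex
    form_fields = {"sampleLocation": image_url, "task": task, "modelId": model_id}
    lines = []
    for name, value in form_fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")          # blank line separates part headers from value
        lines.append(value)
    lines.append(f"--{boundary}--")   # closing boundary
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return urllib.request.Request(
        OCR_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder token and image URL; replace with real values before sending.
request = build_ocr_request("<TOKEN>", "https://example.com/card.jpg")
print(request.get_method(), request.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen` and parse the response body as JSON.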

Response Parameters:

Sample response for task="contact"

{
   "task":"contact",
   "probabilities":[
      {
         "probability":0.9989197,
         "label":"Director Operations and Logistics",
         "boundingBox":{
            "minX":53,
            "minY":105,
            "maxX":466,
            "maxY":134
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.9978701,
         "label":"Randy E. Cornwell",
         "boundingBox":{
            "minX":53,
            "minY":73,
            "maxX":297,
            "maxY":103
         },
         "attributes":{
            "tag":"PERSON"
         }
      },
      {
         "probability":0.99883646,
         "label":"S-1050-06 Headquarters Building",
         "boundingBox":{
            "minX":55,
            "minY":384,
            "maxX":467,
            "maxY":413
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.9981871,
         "label":"Houston TX 77042 USA 2881 CityWest Blvd.",
         "boundingBox":{
            "minX":53,
            "minY":416,
            "maxX":350,
            "maxY":472
         },
         "attributes":{
            "tag":"ADDRESS"
         }
      },
      {
         "probability":0.9984518,
         "label":"284.575.2884",
         "boundingBox":{
            "minX":547,
            "minY":384,
            "maxX":820,
            "maxY":412
         },
         "attributes":{
            "tag":"PHONE"
         }
      },
      {
         "probability":0.9984518,
         "label":"phone ",
         "boundingBox":{
            "minX":547,
            "minY":384,
            "maxX":820,
            "maxY":412
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.9973475,
         "label":"281.889.0997",
         "boundingBox":{
            "minX":541,
            "minY":415,
            "maxX":825,
            "maxY":439
         },
         "attributes":{
            "tag":"MOBILE_PHONE"
         }
      },
      {
         "probability":0.9973475,
         "label":"mobile ",
         "boundingBox":{
            "minX":541,
            "minY":415,
            "maxX":825,
            "maxY":439
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.89205074,
         "label":"re.comwell@phillips86.com",
         "boundingBox":{
            "minX":478,
            "minY":444,
            "maxX":782,
            "maxY":476
         },
         "attributes":{
            "tag":"EMAIL"
         }
      },
      {
         "probability":0.992919,
         "label":"Phillips",
         "boundingBox":{
            "minX":692,
            "minY":79,
            "maxX":780,
            "maxY":100
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.90038645,
         "label":"66",
         "boundingBox":{
            "minX":693,
            "minY":113,
            "maxX":777,
            "maxY":162
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.9908977,
         "label":"Phillips66.com",
         "boundingBox":{
            "minX":53,
            "minY":198,
            "maxX":238,
            "maxY":226
         },
         "attributes":{
            "tag":"WEBSITE"
         }
      },
      {
         "probability":0.89205074,
         "label":"phillips86",
         "boundingBox":{
            "minX":630,
            "minY":444,
            "maxX":782,
            "maxY":476
         },
         "attributes":{
            "tag":"ORG"
         }
      }
   ],
   "object":"predictresponse"
}
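Because each detection carries its own bounding box, the strings can be put back into the order in which they appear on the card. The following is a minimal sketch, assuming that a plain top-to-bottom, then left-to-right sort on `minY` and `minX` is good enough (it does not handle rotated cards or multi-column layouts). The helper name `reading_order` is illustrative, and the input is a subset of the sample response above.

```python
from typing import Dict, List


def reading_order(probabilities: List[Dict]) -> List[str]:
    """Return labels sorted top-to-bottom, then left-to-right."""
    ordered = sorted(
        probabilities,
        key=lambda item: (item["boundingBox"]["minY"], item["boundingBox"]["minX"]),
    )
    return [item["label"].strip() for item in ordered]


# Four entries copied from the sample response.
subset = [
    {"label": "Director Operations and Logistics",
     "boundingBox": {"minX": 53, "minY": 105, "maxX": 466, "maxY": 134}},
    {"label": "Randy E. Cornwell",
     "boundingBox": {"minX": 53, "minY": 73, "maxX": 297, "maxY": 103}},
    {"label": "Phillips66.com",
     "boundingBox": {"minX": 53, "minY": 198, "maxX": 238, "maxY": 226}},
    {"label": "Phillips",
     "boundingBox": {"minX": 692, "minY": 79, "maxX": 780, "maxY": 100}},
]

for line in reading_order(subset):
    print(line)
```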

Important points:

·       Image Orientation — The model handles slight image orientation changes but not above 30–40%. Accuracy is better for images in which the text has a straight vertical orientation.
·       Max File Size — The maximum image file size you can pass to this resource is 5 MB.
·       File Types — The supported image file types are PNG, JPG, and JPEG.
·       Response Sort Order — The detected strings returned in the response are sorted by probability.
·       Max Words Returned — The model returns a maximum of 256 words.
·       Supported Languages — Currently, Einstein OCR supports English only.
·       Handwriting — The model currently doesn’t support handwritten characters.
·       Checkboxes — The model currently doesn’t support checkboxes in any text.
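Some of these limits can be checked before an image is ever sent. Below is a small Python sketch of such a pre-flight check for the file type and size limits listed above. The helper name `check_image` is hypothetical, and the code assumes that "5 MB" means 5 * 1024 * 1024 bytes, which the list above does not specify.

```python
import os
from typing import List

MAX_FILE_SIZE_BYTES = 5 * 1024 * 1024   # assumption: 5 MB = 5 * 1024 * 1024 bytes
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def check_image(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes, above the 5 MB limit")
    return problems


print(check_image("card.JPG", 120_000))
print(check_image("card.tiff", 6 * 1024 * 1024))
```

The extension check is case-insensitive, so `card.JPG` passes; the second call reports both an unsupported type and an oversized file.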

Additional Resources:

Documentation: Einstein Platform Services Developer Guide; Detect Text in Business Cards

Press Release: Salesforce Introduces New Einstein Services, Empowering Every Admin and Developer to Build Custom AI for Their Business

AppExchange: Einstein Vision and Language Model Builder; Configuring Einstein Vision and Language Model Builder; Einstein Vision and Language Model Builder User Manual

Trailhead Project: Quick Start: Einstein Image Classification

