
Utilizing the power of OCR Model in Salesforce

What is Optical Character Recognition (OCR)?
OCR, or Optical Character Recognition, is a process of recognizing text inside images and converting it into an electronic form. These images could be of handwritten text, printed text like documents, receipts, name cards, etc., or even a natural scene photograph.

OCR technology has been applied to revolutionize the document management process. It digitizes printed text so that it can be stored and searched electronically and used in machine processes such as text to speech, machine translation, and text mining. OCR transforms scanned images into fully searchable documents whose text content can be recognized by computers.
Einstein Platform Services
The use of AI has become increasingly prominent in business. More than six billion predictions are generated every day by Einstein AI capabilities embedded across Salesforce, putting the power of machine learning at the fingertips of millions of sales, service, marketing and commerce professionals. Einstein enables business users to be smarter and more productive, so they can spend time and resources on what matters most: their customers.
Einstein Platform Services bring a wide variety of machine learning capabilities, such as image recognition, sentiment, intent and more, directly into Salesforce’s trusted and secure infrastructure. Using just a single line of code, developers can deploy these services into any Salesforce page or build a reusable AI-powered Lightning Web Component that admins can then deploy to end users with just a few clicks. Developers also don't need any model setup, training or data science skills to work with Einstein OCR.

One of the latest Einstein Platform Services is Einstein Optical Character Recognition (OCR), which provides models that detect alphanumeric text in images.
Einstein Optical Character Recognition (OCR):
Einstein Optical Character Recognition (OCR) leverages computer vision to analyze documents and extract the relevant information, making repetitive tasks like data entry more efficient. For example, a developer can build OCR into a Lightning Web Component to enable sales reps to easily upload a business card from a prospect and automatically analyze the unstructured data from the image file, translate that data into text and extract information that can then be updated in the corresponding Salesforce record.
You can access the models from a single REST API endpoint. Each model is used for specific use cases, such as business card scanning, product lookup, and digitizing documents and tables.
When you call the API, you send in an image and the JSON response contains various elements based on the value of the task parameter:
· String of alphanumeric characters that the model predicts.
· Confidence (probability) that the detected bounding box contains text.
· XY coordinates for the location of the character string within the image (also called a bounding box).
· For tabular data, the table row and column in which the text is located.
· For business cards, the entity type of the detected text such as ORG, PERSON, and so on.
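As a rough illustration of how these elements fit together, the sketch below parses a response into typed records. The key names (`probabilities`, `label`, `probability`, `boundingBox`, `attributes`) are taken from the sample response later in this article; the `Detection` class and `parse_detections` function are hypothetical names used only for this example, and `attributes` is kept as a raw dictionary because its contents depend on the task.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Detection:
    """One detected string with its confidence and bounding box (hypothetical helper type)."""
    label: str
    probability: float
    min_x: int
    min_y: int
    max_x: int
    max_y: int
    # Task-specific extras, e.g. {"tag": "PERSON"} for business cards.
    attributes: Dict[str, Any] = field(default_factory=dict)


def parse_detections(response: Dict[str, Any]) -> List[Detection]:
    """Convert the decoded JSON response into a list of Detection records."""
    detections = []
    for item in response.get("probabilities", []):
        box = item["boundingBox"]
        detections.append(
            Detection(
                label=item["label"],
                probability=item["probability"],
                min_x=box["minX"],
                min_y=box["minY"],
                max_x=box["maxX"],
                max_y=box["maxY"],
                attributes=item.get("attributes", {}),
            )
        )
    return detections


# A single entry copied from the sample response in this article.
sample = {
    "task": "contact",
    "probabilities": [
        {
            "probability": 0.9978701,
            "label": "Randy E. Cornwell",
            "boundingBox": {"minX": 53, "minY": 73, "maxX": 297, "maxY": 103},
            "attributes": {"tag": "PERSON"},
        }
    ],
    "object": "predictresponse",
}

for detection in parse_detections(sample):
    print(detection.label, detection.probability, detection.attributes.get("tag"))
```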
Let’s see how we can use Business Card scanning capabilities for building smarter apps.
Scanning Business Card using Einstein OCR:
Sales reps often collect many business cards at trade shows and conferences, and entering that data into the system by hand is inefficient. Studies have also shown that most business cards received never get captured in Salesforce. With the capabilities of the Einstein OCRModel, you can convert physical business card data into Salesforce records quickly and accurately, which can significantly speed up your overall business flow.
Using Einstein Vision and Language Model Builder app:
If you are not a developer but still want to try the various Einstein Platform offerings or build a proof of concept (POC), you can install the Einstein Vision and Language Model Builder app from AppExchange. This article will give you a jump start for using the app. You can also use the documentation provided in the resources section.

Using this app is very simple and straightforward.

1. Install the app from AppExchange.
2. Configure the Einstein settings using the Einstein Admin Page. Read detailed steps at https://quip.com/z6a4AlCUw8n3

3. Select the OCR tab.


6. Hooray! Your OCR model’s result is now available in the response panel on the right side of the page. You can also analyse the raw response, or click a colored section in the formatted response tab to see the result.

Note: The tag attribute represents the content type. Its possible values are:
  • ADDRESS
  • EMAIL
  • FAX
  • HOME_PHONE
  • MOBILE_PHONE
  • OFFICE_PHONE
  • ORG
  • OTHER
  • PERSON
  • PHONE
  • WEBSITE
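One way to turn these tags into a draft record is to keep the highest-confidence string for each tag and ignore `OTHER`. The sketch below assumes detections are plain dictionaries shaped like the sample response in this article; `draft_contact` and the `min_probability` threshold of 0.9 are illustrative choices, not part of the API.

```python
# Tag values listed in this article.
KNOWN_TAGS = {
    "ADDRESS", "EMAIL", "FAX", "HOME_PHONE", "MOBILE_PHONE", "OFFICE_PHONE",
    "ORG", "OTHER", "PERSON", "PHONE", "WEBSITE",
}


def draft_contact(detections, min_probability=0.9):
    """Return {tag: label}, keeping the most confident label per tag.

    Detections tagged OTHER, carrying an unknown tag, or scoring below
    min_probability (an assumed cut-off) are skipped.
    """
    best = {}
    for item in detections:
        tag = item.get("attributes", {}).get("tag", "OTHER")
        if tag == "OTHER" or tag not in KNOWN_TAGS:
            continue
        if item["probability"] < min_probability:
            continue
        if tag not in best or item["probability"] > best[tag]["probability"]:
            best[tag] = item
    return {tag: item["label"] for tag, item in best.items()}


# Entries copied from the sample response in this article (bounding boxes omitted).
detections = [
    {"probability": 0.9978701, "label": "Randy E. Cornwell", "attributes": {"tag": "PERSON"}},
    {"probability": 0.89205074, "label": "re.comwell@phillips86.com", "attributes": {"tag": "EMAIL"}},
    {"probability": 0.89205074, "label": "phillips86", "attributes": {"tag": "ORG"}},
    {"probability": 0.992919, "label": "Phillips", "attributes": {"tag": "OTHER"}},
]

print(draft_contact(detections))
print(draft_contact(detections, min_probability=0.85))
```

Lowering the threshold admits the email and organization strings, which the model detected with less confidence; how strict to be is a judgment call for each use case.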
API Details:
Endpoint: https://api.einstein.ai/v2/vision/ocr
Method: POST
Request Parameters:
Sample request:
curl -X POST -H "Authorization: Bearer <TOKEN>" -F sampleLocation="<CARD_LOCATION>" -F task="contact" -F modelId="OCRModel" https://api.einstein.ai/v2/vision/ocr
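For readers who prefer Python to curl, a minimal sketch of the same call using only the standard library might look like the following. The endpoint and the form fields (`sampleLocation`, `task`, `modelId`) come from this article; the function names are hypothetical, and the token and image URL are placeholders you would supply yourself. Building the request is kept separate from sending it so the request can be inspected without network access.

```python
import json
import urllib.request
import uuid

OCR_ENDPOINT = "https://api.einstein.ai/v2/vision/ocr"  # endpoint given in this article


def build_ocr_request(token, image_url, task="contact", model_id="OCRModel"):
    """Build (but do not send) a multipart/form-data POST for the OCR endpoint."""
    fields = {"sampleLocation": image_url, "task": task, "modelId": model_id}
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        # Each form field becomes one part delimited by the boundary string.
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")
    return urllib.request.Request(
        OCR_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def send_ocr_request(request):
    """Send the request and decode the JSON reply (needs network access and a valid token)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values; replace them before calling send_ocr_request(request).
request = build_ocr_request("<TOKEN>", "<CARD_LOCATION>")
print(request.get_method(), request.full_url)
```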
Response Parameters:

Sample response for task="contact"
{
   "task":"contact",
   "probabilities":[
      {
         "probability":0.9989197,
         "label":"Director Operations and Logistics",
         "boundingBox":{
            "minX":53,
            "minY":105,
            "maxX":466,
            "maxY":134
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.9978701,
         "label":"Randy E. Cornwell",
         "boundingBox":{
            "minX":53,
            "minY":73,
            "maxX":297,
            "maxY":103
         },
         "attributes":{
            "tag":"PERSON"
         }
      },
      {
         "probability":0.99883646,
         "label":"S-1050-06 Headquarters Building",
         "boundingBox":{
            "minX":55,
            "minY":384,
            "maxX":467,
            "maxY":413
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.9981871,
         "label":"Houston TX 77042 USA 2881 CityWest Blvd.",
         "boundingBox":{
            "minX":53,
            "minY":416,
            "maxX":350,
            "maxY":472
         },
         "attributes":{
            "tag":"ADDRESS"
         }
      },
      {
         "probability":0.9984518,
         "label":"284.575.2884",
         "boundingBox":{
            "minX":547,
            "minY":384,
            "maxX":820,
            "maxY":412
         },
         "attributes":{
            "tag":"PHONE"
         }
      },
      {
         "probability":0.9984518,
         "label":"phone ",
         "boundingBox":{
            "minX":547,
            "minY":384,
            "maxX":820,
            "maxY":412
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.9973475,
         "label":"281.889.0997",
         "boundingBox":{
            "minX":541,
            "minY":415,
            "maxX":825,
            "maxY":439
         },
         "attributes":{
            "tag":"MOBILE_PHONE"
         }
      },
      {
         "probability":0.9973475,
         "label":"mobile ",
         "boundingBox":{
            "minX":541,
            "minY":415,
            "maxX":825,
            "maxY":439
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.89205074,
         "label":"re.comwell@phillips86.com",
         "boundingBox":{
            "minX":478,
            "minY":444,
            "maxX":782,
            "maxY":476
         },
         "attributes":{
            "tag":"EMAIL"
         }
      },
      {
         "probability":0.992919,
         "label":"Phillips",
         "boundingBox":{
            "minX":692,
            "minY":79,
            "maxX":780,
            "maxY":100
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.90038645,
         "label":"66",
         "boundingBox":{
            "minX":693,
            "minY":113,
            "maxX":777,
            "maxY":162
         },
         "attributes":{
            "tag":"OTHER"
         }
      },
      {
         "probability":0.9908977,
         "label":"Phillips66.com",
         "boundingBox":{
            "minX":53,
            "minY":198,
            "maxX":238,
            "maxY":226
         },
         "attributes":{
            "tag":"WEBSITE"
         }
      },
      {
         "probability":0.89205074,
         "label":"phillips86",
         "boundingBox":{
            "minX":630,
            "minY":444,
            "maxX":782,
            "maxY":476
         },
         "attributes":{
            "tag":"ORG"
         }
      }
   ],
   "object":"predictresponse"
}
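The bounding boxes in a response like this can be used to put the detected strings back into reading order. The sketch below is one simple approach, assuming the image origin is the top-left corner (which matches the sample, where the name sits above the address) and treating boxes whose `minY` values are within `line_tolerance` pixels of each other as one line. Both `reading_order` and the tolerance of 10 are illustrative assumptions.

```python
def reading_order(detections, line_tolerance=10):
    """Order detections top to bottom, then left to right within a line."""
    by_top = sorted(detections, key=lambda d: d["boundingBox"]["minY"])
    lines = []
    for det in by_top:
        top = det["boundingBox"]["minY"]
        # Join the current line if this box starts close to the line's first box.
        if lines and top - lines[-1][0]["boundingBox"]["minY"] <= line_tolerance:
            lines[-1].append(det)
        else:
            lines.append([det])
    ordered = []
    for line in lines:
        ordered.extend(sorted(line, key=lambda d: d["boundingBox"]["minX"]))
    return ordered


# Four entries copied from the sample response above (attributes omitted).
detections = [
    {"label": "Director Operations and Logistics",
     "boundingBox": {"minX": 53, "minY": 105, "maxX": 466, "maxY": 134}},
    {"label": "Randy E. Cornwell",
     "boundingBox": {"minX": 53, "minY": 73, "maxX": 297, "maxY": 103}},
    {"label": "Phillips",
     "boundingBox": {"minX": 692, "minY": 79, "maxX": 780, "maxY": 100}},
    {"label": "Phillips66.com",
     "boundingBox": {"minX": 53, "minY": 198, "maxX": 238, "maxY": 226}},
]

for det in reading_order(detections):
    print(det["label"])
```

With these four entries, the name and the logo text share the first line, followed by the job title and then the website.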



Important points:

·       Image Orientation — The model handles slight image orientation changes but not above 30–40%. Accuracy is better for images in which the text has a straight vertical orientation.

·       Max File Size — The maximum image file size you can pass to this resource is 5 MB.

·       File Types — The supported image file types are PNG, JPG, and JPEG.

·       Response Sort Order — The detected strings returned in the response are sorted by probability.

·       Max Words Returned — The model returns a maximum of 256 words.

·       Supported Languages — Currently, Einstein OCR supports English only.

·       Handwriting — The model currently doesn’t support handwritten characters.

·       Checkboxes — The model currently doesn’t support checkboxes in any text.
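The size and file-type limits above can be checked before an image is sent. This is a small sketch, assuming "5 MB" means 5 × 1024 × 1024 bytes (the article does not say which definition applies) and that the file type can be judged from the extension alone; `image_problems` is a hypothetical helper name.

```python
import os

MAX_BYTES = 5 * 1024 * 1024  # assumed interpretation of the 5 MB limit
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def image_problems(filename, size_bytes):
    """Return a list of reasons the image would likely be rejected (empty if none)."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems


print(image_problems("card.JPG", 120_000))
print(image_problems("scan.gif", 6 * 1024 * 1024))
```

For a file on disk, `os.path.getsize(path)` would supply the `size_bytes` argument.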


Additional Resources:
