What is Optical Character Recognition (OCR)?
OCR, or Optical Character Recognition, is a process of recognizing text inside images and converting it into an electronic form. These images could be of handwritten text, printed text like documents, receipts, name cards, etc., or even a natural scene photograph.
OCR technology has changed how documents are managed. It digitizes printed text so that it can be stored electronically, searched, and used in machine processes such as text-to-speech, machine translation, and text mining. OCR turns scanned images into fully searchable documents whose text content can be recognized by computers.
Einstein Platform Services
The use of AI has become increasingly prominent in business. More than six billion predictions are generated every day by Einstein AI capabilities embedded across Salesforce, putting the power of machine learning at the fingertips of millions of sales, service, marketing and commerce professionals. Einstein enables business users to be smarter and more productive, so they can spend time and resources on what matters most: their customers.
Einstein Platform Services bring a wide variety of machine learning capabilities, such as image recognition, sentiment, intent and more, directly into Salesforce’s trusted and secure infrastructure. Using just a single line of code, developers can deploy these services into any Salesforce page or build a reusable AI-powered Lightning Web Component that admins can then deploy to end users with just a few clicks. Developers also don't need any model setup, training or data science skills to work with Einstein OCR.
One of the latest Einstein Platform Services is Einstein Optical Character Recognition (OCR), which provides models that detect alphanumeric text in images.
Einstein Optical Character Recognition (OCR):
Einstein Optical Character Recognition (OCR) leverages computer vision to analyze documents and extract the relevant information, making repetitive tasks like data entry more efficient. For example, a developer can build OCR into a Lightning Web Component to enable sales reps to easily upload a business card from a prospect and automatically analyze the unstructured data from the image file, translate that data into text and extract information that can then be updated in the corresponding Salesforce record.
You can access the models from a single REST API endpoint. Each model is used for specific use cases, such as business card scanning, product lookup, and digitizing documents and tables.
When you call the API, you send in an image and the JSON response contains various elements based on the value of the task parameter:
·String of alphanumeric characters that the model predicts.
·Confidence (probability) that the detected bounding box contains text.
·XY coordinates for the location of the character string within the image (also called a bounding box).
·For tabular data, the table row and column in which the text is located.
·For business cards, the entity type of the detected text such as ORG, PERSON, and so on.
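The response elements listed above can be modeled as a small data structure. The following is a minimal sketch, assuming the field names that appear in the sample response later in this article (label, probability, boundingBox, attributes.tag). The Detection class and parse_detection helper are hypothetical names used for illustration and are not part of any Einstein SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """One entry of the response's "probabilities" array."""
    label: str                 # string of characters the model predicts
    probability: float         # confidence that the bounding box contains text
    min_x: int                 # bounding box corners, in image pixels
    min_y: int
    max_x: int
    max_y: int
    tag: Optional[str] = None  # entity type, returned for business cards


def parse_detection(item: dict) -> Detection:
    """Convert one raw JSON element into a Detection."""
    box = item["boundingBox"]
    attributes = item.get("attributes", {})
    return Detection(
        label=item["label"],
        probability=item["probability"],
        min_x=box["minX"],
        min_y=box["minY"],
        max_x=box["maxX"],
        max_y=box["maxY"],
        tag=attributes.get("tag"),
    )


# One element copied from the sample response in this article.
raw = {
    "probability": 0.9978701,
    "label": "Randy E. Cornwell",
    "boundingBox": {"minX": 53, "minY": 73, "maxX": 297, "maxY": 103},
    "attributes": {"tag": "PERSON"},
}
detection = parse_detection(raw)
print(detection.label, detection.tag)
```

Keeping the tag optional lets the same structure hold results from tasks that do not return an entity type.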
Let’s see how we can use Business Card scanning capabilities for building smarter apps.
Scanning Business Cards using Einstein OCR:
Sales reps often collect many business cards at trade shows and conferences, and entering this data into the system manually is inefficient. Studies have also shown that most of the business cards received never get captured in Salesforce. With the capabilities of the Einstein OCRModel, you can convert physical business card data into Salesforce records quickly and accurately. This can significantly streamline your overall business flow.
Using Einstein Vision and Language Model Builder app:
If you are not a developer but still want to explore the Einstein Platform offerings or try a proof of concept (POC), you can install the Einstein Vision and Language Model Builder app from AppExchange. This article gives you a jump start on using the app. You can also use the documentation listed in the resources section.
Using this app is simple and straightforward.
1. Install the app from AppExchange.
2. Configure the Einstein settings using the Einstein Admin page. Read the detailed steps at https://quip.com/z6a4AlCUw8n3
3. Select the OCR tab.
6. Hooray! Your OCR model’s result is now available in the response panel on the right side of the page. You can also analyze the raw response, or click a colored section in the formatted response tab to see the result.
Note: The tag attribute represents the content type. Possible values are:
- ADDRESS
- EMAIL
- FAX
- HOME_PHONE
- MOBILE_PHONE
- OFFICE_PHONE
- ORG
- OTHER
- PERSON
- PHONE
- WEBSITE
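The tag values above lend themselves to a simple lookup when turning detections into record fields. The sketch below keeps the highest-probability label for each tag and maps it to a field name. FIELD_FOR_TAG and both helper functions are hypothetical: the field names are placeholders for illustration, not a confirmed Salesforce schema.

```python
# Hypothetical mapping from OCR tags to record field names (placeholders).
FIELD_FOR_TAG = {
    "PERSON": "Name",
    "ORG": "Company",
    "EMAIL": "Email",
    "MOBILE_PHONE": "MobilePhone",
    "PHONE": "Phone",
    "WEBSITE": "Website",
}


def best_label_per_tag(probabilities):
    """Keep the highest-probability detection for each tag."""
    best = {}
    for item in probabilities:
        tag = item.get("attributes", {}).get("tag", "OTHER")
        if tag not in best or item["probability"] > best[tag]["probability"]:
            best[tag] = item
    return {tag: item["label"].strip() for tag, item in best.items()}


def to_contact_fields(probabilities):
    """Map tagged detections to the placeholder field names above."""
    labels = best_label_per_tag(probabilities)
    return {
        field: labels[tag]
        for tag, field in FIELD_FOR_TAG.items()
        if tag in labels
    }


# A few entries taken from the sample response in this article.
sample = [
    {"probability": 0.9978701, "label": "Randy E. Cornwell",
     "attributes": {"tag": "PERSON"}},
    {"probability": 0.9973475, "label": "281.889.0997",
     "attributes": {"tag": "MOBILE_PHONE"}},
    {"probability": 0.9973475, "label": "mobile ",
     "attributes": {"tag": "OTHER"}},
    {"probability": 0.9908977, "label": "Phillips66.com",
     "attributes": {"tag": "WEBSITE"}},
]
print(to_contact_fields(sample))
```

Tags without a mapping, such as OTHER, are simply left out, so caption text like "mobile" does not end up in a record field.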
API Details:
Sample request:
curl -X POST -H "Authorization: Bearer <TOKEN>" -F sampleLocation="<CARD_LOCATION>" -F task="contact" -F modelId="OCRModel" https://api.einstein.ai/v2/vision/ocr
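For readers who prefer to make the same call from code, here is a rough Python equivalent of the curl request that uses only the standard library. It is a sketch under assumptions: OCR_URL is the OCR endpoint as I understand it from the Einstein Platform Services documentation and should be verified against the current docs, and build_ocr_request is a hypothetical helper. The block only builds the request; the lines that would send it are left commented out because they need a valid token and network access.

```python
import urllib.request
import uuid

# Assumption: the Einstein OCR endpoint; verify against the current docs.
OCR_URL = "https://api.einstein.ai/v2/vision/ocr"


def build_ocr_request(token, card_location, task="contact", model_id="OCRModel"):
    """Build a multipart/form-data POST carrying the same fields as the curl call."""
    fields = {
        "sampleLocation": card_location,  # URL of the business card image
        "task": task,
        "modelId": model_id,
    }
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")
    return urllib.request.Request(
        OCR_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


request = build_ocr_request("<TOKEN>", "<CARD_LOCATION>")

# To send the request (requires a real token and image URL):
# import json
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```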
Response Parameters:
Sample response for task="contact":
{
"task":"contact",
"probabilities":[
{
"probability":0.9989197,
"label":"Director Operations and Logistics",
"boundingBox":{
"minX":53,
"minY":105,
"maxX":466,
"maxY":134
},
"attributes":{
"tag":"OTHER"
}
},
{
"probability":0.9978701,
"label":"Randy E. Cornwell",
"boundingBox":{
"minX":53,
"minY":73,
"maxX":297,
"maxY":103
},
"attributes":{
"tag":"PERSON"
}
},
{
"probability":0.99883646,
"label":"S-1050-06 Headquarters Building",
"boundingBox":{
"minX":55,
"minY":384,
"maxX":467,
"maxY":413
},
"attributes":{
"tag":"OTHER"
}
},
{
"probability":0.9981871,
"label":"Houston TX 77042 USA 2881 CityWest Blvd.",
"boundingBox":{
"minX":53,
"minY":416,
"maxX":350,
"maxY":472
},
"attributes":{
"tag":"ADDRESS"
}
},
{
"probability":0.9984518,
"label":"284.575.2884",
"boundingBox":{
"minX":547,
"minY":384,
"maxX":820,
"maxY":412
},
"attributes":{
"tag":"PHONE"
}
},
{
"probability":0.9984518,
"label":"phone ",
"boundingBox":{
"minX":547,
"minY":384,
"maxX":820,
"maxY":412
},
"attributes":{
"tag":"OTHER"
}
},
{
"probability":0.9973475,
"label":"281.889.0997",
"boundingBox":{
"minX":541,
"minY":415,
"maxX":825,
"maxY":439
},
"attributes":{
"tag":"MOBILE_PHONE"
}
},
{
"probability":0.9973475,
"label":"mobile ",
"boundingBox":{
"minX":541,
"minY":415,
"maxX":825,
"maxY":439
},
"attributes":{
"tag":"OTHER"
}
},
{
"probability":0.89205074,
"label":"re.comwell@phillips86.com",
"boundingBox":{
"minX":478,
"minY":444,
"maxX":782,
"maxY":476
},
"attributes":{
"tag":"EMAIL"
}
},
{
"probability":0.992919,
"label":"Phillips",
"boundingBox":{
"minX":692,
"minY":79,
"maxX":780,
"maxY":100
},
"attributes":{
"tag":"OTHER"
}
},
{
"probability":0.90038645,
"label":"66",
"boundingBox":{
"minX":693,
"minY":113,
"maxX":777,
"maxY":162
},
"attributes":{
"tag":"OTHER"
}
},
{
"probability":0.9908977,
"label":"Phillips66.com",
"boundingBox":{
"minX":53,
"minY":198,
"maxX":238,
"maxY":226
},
"attributes":{
"tag":"WEBSITE"
}
},
{
"probability":0.89205074,
"label":"phillips86",
"boundingBox":{
"minX":630,
"minY":444,
"maxX":782,
"maxY":476
},
"attributes":{
"tag":"ORG"
}
}
],
"object":"predictresponse"
}
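In the sample response above, the email entry has one of the lowest probabilities (about 0.89), so it can be useful to route low-confidence detections to a person for review before saving them. The following is a minimal sketch of that idea. The 0.95 threshold and the split_by_confidence helper are assumptions chosen for illustration, not values recommended by Salesforce.

```python
def split_by_confidence(probabilities, threshold=0.95):
    """Separate detections into accepted labels and labels needing review."""
    accepted, needs_review = [], []
    for item in probabilities:
        target = accepted if item["probability"] >= threshold else needs_review
        target.append(item["label"].strip())
    return accepted, needs_review


# A few entries taken from the sample response above.
sample = [
    {"probability": 0.9978701, "label": "Randy E. Cornwell"},
    {"probability": 0.89205074, "label": "re.comwell@phillips86.com"},
    {"probability": 0.90038645, "label": "66"},
    {"probability": 0.9908977, "label": "Phillips66.com"},
]
accepted, needs_review = split_by_confidence(sample)
print("accepted:", accepted)
print("needs review:", needs_review)
```

With this threshold, the name and website are accepted, while the email and the "66" fragment are flagged for a manual check.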
Important points:
· Image Orientation — The model handles slight image orientation changes but not above 30–40%. Accuracy is better for images in which the text has a straight vertical orientation.
· Max File Size — The maximum image file size you can pass to this resource is 5 MB.
· File Types — The supported image file types are PNG, JPG, and JPEG.
· Response Sort Order — The detected strings returned in the response are sorted by probability.
· Max Words Returned — The model returns a maximum of 256 words.
· Supported Languages — Currently, Einstein OCR supports English only.
· Handwriting — The model currently doesn’t support handwritten characters.
· Checkboxes — The model currently doesn’t support checkboxes in any text.
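Some of the limits above can be checked on the client before an image is sent. The sketch below validates the file type and size only. check_image is a hypothetical helper, and it assumes that 5 MB means 5 * 1024 * 1024 bytes, which the source does not specify.

```python
import os

# Assumption: 1 MB is treated as 1024 * 1024 bytes.
MAX_BYTES = 5 * 1024 * 1024
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def check_image(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or 'none'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is larger than 5 MB: {size_bytes} bytes")
    return problems


print(check_image("card.JPG", 200_000))
print(check_image("card.gif", 200_000))
```

Orientation, language, and handwriting limits cannot be verified this way and still depend on the image content itself.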