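Author: wangwei8638
The OCR card and certificate recognition service has been upgraded again: business license recognition now supports the latest 2019 version. Let's see how well it works.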
I. Platform Access
This step is straightforward, so it is not covered in detail here. See the earlier post:
https://ai.baidu.com/forum/topic/show/943028
II. Analyzing the API Documentation
1. API documentation
Open the API documentation: https://ai.baidu.com/docs#/OCR-API/c20eb356
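(1) API description
Recognizes a business license and returns the values of its key fields, including company name (单位名称), type (类型), legal representative (法人), address (地址), validity period (有效期), certificate number (证件编号), and social credit code (社会信用代码).
(2) Request description
The following information is needed:
Request URL: https://aip.baidubce.com/rest/2.0/ocr/v1/business_license
Header: Content-Type: application/x-www-form-urlencoded
Request parameters:
image: the image data, base64-encoded. The base64-encoded data must not exceed 4M, the shortest side must be at least 15px, and the longest side at most 4096px. The jpg, png, and bmp formats are supported. Note: the image must be base64-encoded, the encoding header removed, and the result then urlencoded.
detect_direction: whether to detect the image orientation. Defaults to false (no detection).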
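The encoding rule for the image parameter can be sketched as follows. This is a minimal illustration rather than part of the original code: the helper name build_request_body is hypothetical, and reading "4M" as 4 * 1024 * 1024 bytes is an assumption.

```python
import base64
import urllib.parse

MAX_BASE64_BYTES = 4 * 1024 * 1024  # assumption: "4M" read as 4 MiB


def build_request_body(image_bytes, detect_direction=False):
    """Hypothetical helper: build the form-encoded body for business_license."""
    # base64.b64encode yields bare base64 without a "data:image/...;base64,"
    # header, so there is no encoding header to strip here.
    encoded = base64.b64encode(image_bytes)
    if len(encoded) > MAX_BASE64_BYTES:
        raise ValueError("base64-encoded image exceeds the 4M limit")
    params = {
        "image": encoded,
        "detect_direction": "true" if detect_direction else "false",
    }
    # urlencode percent-escapes the '+', '/' and '=' characters of base64.
    return urllib.parse.urlencode(params).encode("utf-8")
```

The returned bytes can be passed as the data argument of urllib.request.Request together with the Content-Type header listed above. The pixel-size limits are not checked here because that would need an image library.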
(3) Return example
{
  "log_id": 490058765,
  "words_result": {
    "单位名称": {"location": {"left": 500, "top": 479, "width": 618, "height": 54}, "words": "袁氏财团有限公司"},
    "类型": {"location": {"left": 53, "top": 64, "width": 74, "height": 97}, "words": "有限责任公司(自然人独资)"},
    "法人": {"location": {"left": 938, "top": 557, "width": 94, "height": 46}, "words": "袁运筹"},
    "地址": {"location": {"left": 503, "top": 644, "width": 574, "height": 57}, "words": "江苏省南京市中山东路19号"},
    "有效期": {"location": {"left": 779, "top": 1108, "width": 271, "height": 49}, "words": "2015年02月12日"},
    "证件编号": {"location": {"left": 1219, "top": 357, "width": 466, "height": 39}, "words": "苏餐证字(2019)第666602666661号"},
    "社会信用代码": {"location": {"left": 0, "top": 0, "width": 0, "height": 0}, "words": "无"}
  },
  "words_result_num": 6
}
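Reading the fields out of a response shaped like this example can be sketched as follows. The helper name extract_words is hypothetical, and treating "无" as "not recognized" is an assumption based on the 社会信用代码 entry in the example, whose location is all zeros.

```python
import json


def extract_words(response_text):
    """Hypothetical helper: map each field name to its recognized text."""
    data = json.loads(response_text)
    fields = {}
    for name, item in data.get("words_result", {}).items():
        words = item.get("words", "")
        # Assumption: "无" marks a field that was not found on the image.
        fields[name] = None if words == "无" else words
    return fields


# Shortened copy of the return example, for illustration only.
sample = '''{"log_id": 490058765, "words_result": {
  "单位名称": {"location": {"left": 500, "top": 479, "width": 618, "height": 54},
               "words": "袁氏财团有限公司"},
  "社会信用代码": {"location": {"left": 0, "top": 0, "width": 0, "height": 0},
                   "words": "无"}}}'''
print(extract_words(sample))
```

Using .get instead of direct indexing means a missing field yields an empty result instead of raising KeyError.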
2. Obtaining the access token
import json
import urllib.request

# client_id is the AK and client_secret is the SK, both obtained from the official site
client_id = '[AK of your Baidu Cloud application]'
client_secret = '[SK of your Baidu Cloud application]'

# Get the access token
def get_token():
    host = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=' + client_id + '&client_secret=' + client_secret
    request = urllib.request.Request(host)
    request.add_header('Content-Type', 'application/json; charset=UTF-8')
    response = urllib.request.urlopen(request)
    token_content = response.read()
    if token_content:
        token_info = json.loads(token_content.decode("utf-8"))
        return token_info['access_token']
    # An empty response means no token could be obtained
    return None
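Concatenating the AK and SK into the URL works as long as they contain only URL-safe characters. As a hedged alternative, the query string can be built with urllib.parse.urlencode, which escapes any special characters. The helper name build_token_url is hypothetical, and the credentials in the usage line are dummy values.

```python
import urllib.parse

TOKEN_ENDPOINT = "https://aip.baidubce.com/oauth/2.0/token"


def build_token_url(client_id, client_secret):
    """Hypothetical helper: build the token request URL with escaped parameters."""
    query = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    })
    return TOKEN_ENDPOINT + "?" + query


# Dummy credentials, for illustration only.
print(build_token_url("my-ak", "my-sk"))
```

The resulting URL can be passed to urllib.request.Request in place of the concatenated host string.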
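III. Recognition Results
1. Old-format business license
Recognition result:
Company name: 东莞成东建筑有限公司
Legal representative: 张蓓
Validity period: 2059年02月23日
Certificate number: 221206644777220
Social credit code: 641206408384421
Address: 东莞市莞城街道福希家园302
2. New-format business license
Recognition result:
Company name: 甘肃省亿家辉煌商贸有限公司
Legal representative: 马小迁
Validity period: 2039年03月17日
Certificate number: 无
Social credit code: 91620524MA74QXPX4P
Address: 甘肃省天水市武山县滩歌镇上街村七组滩歌镇中心花园斜对面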
IV. Source Code
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import base64
import json
import urllib.parse
import urllib.request

# client_id is the AK and client_secret is the SK, both obtained from the official site
client_id = '*************'
client_secret = '*******************'


# Get the access token
def get_token():
    host = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=' + client_id + '&client_secret=' + client_secret
    request = urllib.request.Request(host)
    request.add_header('Content-Type', 'application/json; charset=UTF-8')
    response = urllib.request.urlopen(request)
    token_content = response.read()
    if token_content:
        token_info = json.loads(token_content.decode("utf-8"))
        return token_info['access_token']
    # An empty response means no token could be obtained
    return None


# Read the image file as raw bytes
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()


# Get the business license information
def get_license_plate(path):
    request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/business_license"
    f = get_file_content(path)
    access_token = get_token()
    img = base64.b64encode(f)
    params = {"image": img}
    params = urllib.parse.urlencode(params).encode('utf-8')
    request_url = request_url + "?access_token=" + access_token
    request = urllib.request.Request(url=request_url, data=params)
    request.add_header('Content-Type', 'application/x-www-form-urlencoded')
    response = urllib.request.urlopen(request)
    content = response.read()
    if content:
        business_licenses = json.loads(content.decode("utf-8"))
        strover = 'Recognition result:\n'
        words_result = business_licenses['words_result']
        # Company name
        Unit_name = words_result['单位名称']['words']
        strover += ' Company name:\n {} \n '.format(Unit_name)
        # Legal representative
        legal_person = words_result['法人']['words']
        strover += ' Legal representative: {} \n '.format(legal_person)
        # Validity period
        Term_of_validity = words_result['有效期']['words']
        strover += ' Validity period: {} \n '.format(Term_of_validity)
        # Certificate number
        ID_number = words_result['证件编号']['words']
        strover += ' Certificate number: {} \n '.format(ID_number)
        # Social credit code
        Social_Credit_Code = words_result['社会信用代码']['words']
        strover += ' Social credit code: {} \n '.format(Social_Credit_Code)
        # Address
        address = words_result['地址']['words']
        strover += ' Address:\n{}\n '.format(address)
        # print (content)
        print(strover)
        return content
    else:
        return ''


# Raw string so the backslashes in the Windows path are kept literally
image_path = r'F:\paddle\z4.png'
get_license_plate(image_path)
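The script indexes words_result directly, so a failed request would surface as a KeyError. A hedged sketch of a check that could run before the fields are read is shown below. The helper name parse_ocr_response is hypothetical, and the assumption that a failed call returns a JSON object with error_code and error_msg keys should be verified against the API documentation.

```python
import json


def parse_ocr_response(content):
    """Hypothetical helper: return words_result or raise a readable error."""
    result = json.loads(content.decode("utf-8"))
    # Assumption: a failed call carries "error_code" and "error_msg".
    if "error_code" in result:
        raise RuntimeError("OCR request failed: {} {}".format(
            result["error_code"], result.get("error_msg", "")))
    if "words_result" not in result:
        raise RuntimeError("response contains no words_result")
    return result["words_result"]
```

In get_license_plate, the bytes returned by response.read() could be passed to this helper in place of the direct json.loads call.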
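V. Feedback and Suggestions
Overall recognition accuracy is high. Analysis of the returned JSON shows that some address and business-scope fields come back incomplete: for example, when an address spans two lines of text, only the first line is returned. This should be improved.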