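Vision is Apple's computer vision framework for processing images and video. It supports tasks such as text recognition, object detection, and face detection.
VNRecognizeTextRequest is the core request class behind its text recognition feature. It extracts text from still images or individual video frames, supports multiple languages, and offers a high-accuracy recognition level (.accurate) alongside a faster one (.fast).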
Example usage: recognize the text in an image picked from the photo library and display it directly on screen.
```swift
import UIKit
import Vision

class ViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate {

    // Displays the recognized text
    private let resultTextView: UITextView = {
        let textView = UITextView()
        textView.isEditable = false
        textView.font = UIFont.systemFont(ofSize: 16)
        textView.textColor = .black
        textView.backgroundColor = UIColor(white: 0.95, alpha: 1)
        textView.translatesAutoresizingMaskIntoConstraints = false
        return textView
    }()

    override func viewDidLoad() {
        super.viewDidLoad()
        view.backgroundColor = .white

        // Add a button for choosing an image
        let selectImageButton = UIButton(type: .system)
        selectImageButton.setTitle("Choose Image", for: .normal)
        selectImageButton.addTarget(self, action: #selector(selectImageTapped), for: .touchUpInside)
        selectImageButton.frame = CGRect(x: 0, y: 0, width: 200, height: 50)
        selectImageButton.center = CGPoint(x: view.center.x, y: 100)
        view.addSubview(selectImageButton)

        // Add the result text view below the button
        view.addSubview(resultTextView)
        NSLayoutConstraint.activate([
            resultTextView.topAnchor.constraint(equalTo: selectImageButton.bottomAnchor, constant: 20),
            resultTextView.leadingAnchor.constraint(equalTo: view.leadingAnchor, constant: 20),
            resultTextView.trailingAnchor.constraint(equalTo: view.trailingAnchor, constant: -20),
            resultTextView.bottomAnchor.constraint(equalTo: view.bottomAnchor, constant: -20)
        ])
    }

    @objc func selectImageTapped() {
        let picker = UIImagePickerController()
        picker.delegate = self
        picker.sourceType = .photoLibrary
        present(picker, animated: true)
    }

    func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey : Any]) {
        picker.dismiss(animated: true)
        // Get the selected image
        guard let image = info[.originalImage] as? UIImage else { return }
        // Run text recognition on it
        recognizeText(from: image)
    }

    func recognizeText(from image: UIImage) {
        guard let cgImage = image.cgImage else { return }

        // Create the text recognition request.
        // [weak self] keeps the closure from retaining the view controller.
        let request = VNRecognizeTextRequest { [weak self] (request, error) in
            if let error = error {
                print("Text recognition failed: \(error)")
                return
            }
            self?.handleDetectionResults(request.results)
        }

        // Configure the request
        request.recognitionLevel = .accurate
        request.recognitionLanguages = ["en-US", "zh-Hans"] // English and Simplified Chinese
        request.usesLanguageCorrection = false

        // Create the request handler and perform the request off the main thread
        let requestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        DispatchQueue.global(qos: .userInitiated).async {
            do {
                try requestHandler.perform([request])
            } catch {
                print("Image request failed: \(error)")
            }
        }
    }

    func handleDetectionResults(_ results: [Any]?) {
        guard let results = results as? [VNRecognizedTextObservation] else { return }

        // Take the best candidate of each observation and join them line by line
        let detectedText = results
            .compactMap { $0.topCandidates(1).first?.string }
            .joined(separator: "\n")

        // Update the UI on the main thread
        DispatchQueue.main.async {
            self.resultTextView.text = detectedText.isEmpty ? "No text detected" : detectedText
        }
    }
}
```
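The example above passes the raw `cgImage` to Vision, which drops the orientation stored on the `UIImage`. Photos from the library are often stored rotated, and recognition may suffer when the text reaches Vision sideways. The following is a minimal sketch of one way to forward the orientation. The helper initializer and the `makeRequestHandler(for:)` function are assumptions added for illustration and are not part of the original example.

```swift
import UIKit
import ImageIO
import Vision

// Assumed helper (not in the original example): maps UIKit's orientation
// enum onto the ImageIO orientation type that Vision accepts.
extension CGImagePropertyOrientation {
    init(_ orientation: UIImage.Orientation) {
        switch orientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default: self = .up
        }
    }
}

// Hypothetical factory: builds a request handler that knows how the
// image should be rotated before analysis.
func makeRequestHandler(for image: UIImage) -> VNImageRequestHandler? {
    guard let cgImage = image.cgImage else { return nil }
    return VNImageRequestHandler(
        cgImage: cgImage,
        orientation: CGImagePropertyOrientation(image.imageOrientation),
        options: [:]
    )
}
```

In `recognizeText(from:)`, the handler returned here could stand in for the one created with `VNImageRequestHandler(cgImage:options:)`.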
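Code walkthrough
UITextView:
Displays all of the recognized text.
It wraps lines and scrolls automatically, which suits longer text.
VNRecognizeTextRequest:
Vision's text recognition request, used to extract the text content of an image.
Multi-language support:
Setting recognitionLanguages to ["en-US", "zh-Hans"] enables recognition of mixed English and Simplified Chinese text.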
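Which language codes are accepted depends on the OS version and the recognition level, so it can help to ask Vision before setting `recognitionLanguages`. Below is a minimal sketch that assumes iOS 15 or later, where the instance method `supportedRecognitionLanguages()` is available. The function name `supportedLanguages(among:level:)` is hypothetical and not part of the original example.

```swift
import Vision

// Hypothetical helper (not in the original example): returns the subset of
// the requested language codes that Vision reports as supported for the
// given recognition level on the current OS.
@available(iOS 15.0, macOS 12.0, tvOS 15.0, *)
func supportedLanguages(among wanted: [String],
                        level: VNRequestTextRecognitionLevel = .accurate) -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = level
    // supportedRecognitionLanguages() can throw; treat a failure as "none supported".
    guard let available = try? request.supportedRecognitionLanguages() else {
        return []
    }
    // Preserve the caller's order, since the example lists languages deliberately.
    return wanted.filter { available.contains($0) }
}
```

One possible use is to call `supportedLanguages(among: ["en-US", "zh-Hans"])` and assign the result to `request.recognitionLanguages` only when it is not empty, leaving the request's default otherwise.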