hiccLog: hicc log by wcc

Swift 3 Word Tokenization

Intro

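After releasing the Mac version of Hipo, I spent a day upgrading the Hipo iOS code to Swift 3. Next I plan to add Spotlight search to Hipo iOS, which means text has to be split into words that can serve as search keywords.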

Below is a simple implementation that relies on CFStringTokenizer (Core Foundation, Apple Developer Documentation).


import Foundation

extension String {
    /// Splits the string into word tokens using CFStringTokenizer and the current locale.
    func tokenize() -> [String] {
        // CFStringTokenizer reports ranges in UTF-16 code units, so go through NSString,
        // whose length and substring(with:) use the same units. Character counts
        // would give wrong ranges for emoji and other multi-unit characters.
        let text = self as NSString
        let tokenizer = CFStringTokenizerCreate(kCFAllocatorDefault, text as CFString, CFRangeMake(0, text.length), kCFStringTokenizerUnitWord, CFLocaleCopyCurrent())
        var keyWords: [String] = []
        CFStringTokenizerAdvanceToNextToken(tokenizer)
        var range = CFStringTokenizerGetCurrentTokenRange(tokenizer)
        // An empty range means the tokenizer has run out of tokens.
        while range.length > 0 {
            let keyWord = text.substring(with: NSRange(location: range.location, length: range.length))
            keyWords.append(keyWord)
            CFStringTokenizerAdvanceToNextToken(tokenizer)
            range = CFStringTokenizerGetCurrentTokenRange(tokenizer)
        }
        return keyWords
    }
}
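
For search keywords it may also help to normalize the tokens before indexing them. The following is only a sketch under my own assumptions: `searchKeywords(from:)` is a hypothetical helper that does not appear in the Hipo code, and lowercasing plus de-duplication are choices I am assuming here, not something Spotlight requires. It uses the same CFStringTokenizer calls as above, written as a free function so it runs on its own.

```swift
import Foundation

/// Hypothetical helper (not from the original post): tokenizes `text` into
/// lowercase, de-duplicated keywords, keeping first-seen order.
func searchKeywords(from text: String) -> [String] {
    let source = text as NSString
    // CFStringTokenizer ranges are expressed in UTF-16 code units, like NSString.
    let tokenizer = CFStringTokenizerCreate(
        kCFAllocatorDefault,
        source as CFString,
        CFRangeMake(0, source.length),
        kCFStringTokenizerUnitWord,
        CFLocaleCopyCurrent()
    )

    var seen = Set<String>()
    var keywords: [String] = []
    // The returned token type has a raw value of 0 once no token is left.
    while CFStringTokenizerAdvanceToNextToken(tokenizer).rawValue != 0 {
        let range = CFStringTokenizerGetCurrentTokenRange(tokenizer)
        let nsRange = NSRange(location: range.location, length: range.length)
        let token = source.substring(with: nsRange).lowercased()
        // Assumption: repeated words add nothing as search keywords.
        if seen.insert(token).inserted {
            keywords.append(token)
        }
    }
    return keywords
}

let keywords = searchKeywords(from: "Hipo iOS and Hipo Mac")
print(keywords)
```

An array like this could then be assigned to the `keywords` property of a `CSSearchableItemAttributeSet` when indexing an item with Core Spotlight. That wiring is left out here, since it depends on how the app models its items.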