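How to split English characters from whitespace characters more finely, and how to split quantifier phrases made of Chinese characters and digits #1283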
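import com.hankcs.hanlp.HanLP;
import com.hankcs.hanlp.dictionary.other.CharType;
import com.hankcs.hanlp.seg.common.Term;
import java.util.List;
// Mark '一' as a Chinese numeral so that it can be split from the preceding "3."
CharType.type['一'] = CharType.CT_CNUM;
// Mark carriage return and line feed as delimiters
CharType.type['\r'] = CharType.CT_DELIMITER;
CharType.type['\n'] = CharType.CT_DELIMITER;
List<Term> termList = HanLP.segment(
"3.一位项目经理应该做下列哪一项?(C)\r\n"
);
System.out.println(termList);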
Also, version 1.7.4 removed
Thank you very much. I tried it and found that "一" is now split off, but )\r\n still cannot be separated.
Try the patch that was just committed to the repository, or
OK, that does work now.
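For readers who want to see why editing a per-character type table changes where a text is cut, here is a minimal standalone sketch. It is not HanLP's implementation and does not call HanLP. The class `CharTypeSplitSketch`, its type codes, and its default table are all hypothetical, chosen only so that the toy reproduces the symptoms described in this issue: adjacent characters of the same type are merged into one run, so reassigning a character's type moves the boundaries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration only: every name and default here is an assumption, not HanLP's API.
class CharTypeSplitSketch {
    // Hypothetical type codes for this sketch (not HanLP's constants).
    static final byte OTHER = 0;
    static final byte NUM = 1;
    static final byte CNUM = 2;
    static final byte DELIMITER = 3;
    static final byte CHINESE = 4;

    // One type per UTF-16 code unit, editable at runtime.
    static final byte[] type = new byte[65536];

    static {
        resetTypes();
    }

    // Toy defaults, picked so that "3.\u4e00" and ")\r\n" each merge into one run.
    static void resetTypes() {
        Arrays.fill(type, OTHER);
        for (char c = '0'; c <= '9'; c++) type[c] = NUM;
        for (char c = 0x4E00; c <= 0x9FA5; c++) type[c] = CHINESE;
        type['.'] = NUM;       // assumption: the dot glues onto digits
        type['\u4e00'] = NUM;  // assumption: the numeral "one" starts out numeric
    }

    // Merge adjacent characters that share a type into a single run.
    static List<String> split(String text) {
        List<String> runs = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= text.length(); i++) {
            boolean atEnd = i == text.length();
            if (atEnd || type[text.charAt(i)] != type[text.charAt(i - 1)]) {
                runs.add(text.substring(start, i));
                start = i;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // "3." + the numeral "one" + one ordinary Chinese character + ")" + CRLF
        String text = "3.\u4e00\u4f4d)\r\n";
        System.out.println(split(text).size() + " runs with the toy defaults");

        // The same kind of override as in the comment above, applied to the toy table.
        type['\u4e00'] = CNUM;
        type['\r'] = DELIMITER;
        type['\n'] = DELIMITER;
        System.out.println(split(text).size() + " runs after the overrides");
    }
}
```

In this toy, the overrides separate the numeral from "3." and the line break from the right parenthesis. A real segmenter applies further rules on top of character types, which may be why the actual fix in this thread also required a patch to the library.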
Montinosq added a commit to Montinosq/HanLP that referenced this issue on Sep 13, 2024
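Notes
Please confirm the following notes:
Version
The current latest version is: portable-1.7.4
The version I am using is: portable-1.6.4
My question
After running the program below, "3.一" in the text is recognized as a single word, and the English (ASCII) right parenthesis together with the line break \r\n is also recognized as a single word. How can they be split apart?
Reproducing the problem
Steps
Triggering code
Expected output
Actual output
Other information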