Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

支持中文URL和中文静态文件名 #559

Merged
merged 1 commit into from
May 29, 2024
Merged

支持中文URL和中文静态文件名 #559

merged 1 commit into from
May 29, 2024

Conversation

xth
Copy link
Contributor

@xth xth commented May 25, 2024

Bug:
Windows下,对中文路径支持不佳,Http Server返回404
Fixed:
静态文件和目录支持中文
PS:
中国人的软件必须支持中文O(∩_∩)O

stat_interval = 10; // s
expired_time = 60; // s
#ifdef OS_WIN
std::setlocale(LC_CTYPE, "en_US.utf8");
Copy link
Owner

@ithewei ithewei May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

看了下,你的主要实现就是调用了std::setlocale(LC_CTYPE, "en_US.utf8"); 这个你也可以自行在外部调用吧,不必要非要在libhv内部去调用,可能不是所有用户都想如此设置。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

大神言之有理。那我改成windows宽字节版本的open调用再发一个PR吧

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit已修改,使用如下方式支持非英文路径
#ifdef OS_WIN
static std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> conv;
auto wfilepath=conv.from_bytes(filepath);
int fd = _wopen(wfilepath.c_str(), flags);
#else
int fd = open(filepath, flags);
#endif

expired_time = 60; // s
}

int hv_open(char const* filepath,int flags){
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

内部函数加static,{前加空格

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已修改


int hv_open(char const* filepath,int flags){
#ifdef OS_WIN
static std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> conv;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个需要声明为static吗,多线程安全吗?


int hv_open(char const* filepath,int flags){
#ifdef OS_WIN
static std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> conv;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个需要声明为static吗,多线程安全吗?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

一开始想着外部已经加了锁了,static一下性能好一点。不过从代码规范和安全角度,是不应该声明为static

int hv_open(char const* filepath,int flags){
#ifdef OS_WIN
static std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> conv;
auto wfilepath=conv.from_bytes(filepath);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=前后加空格

PS:
中国人的软件必须支持中文O(∩_∩)O

Signed-off-by: xth <xth@live.com>
@xth
Copy link
Contributor Author

xth commented May 29, 2024

已按要求一一修改。另外我想问一下
#ifdef OS_WIN
// NOTE: open(dir) return -1 on windows
if (!hv_isdir(filepath)) {
param->error = ERR_OPEN_FILE;
return NULL;
}
#else
这块是不是多余了?因为filepath如果是目录,外部传进来的时候已经自动加上“index.html”转成文件路径了。就是真传进来一个目录,open在这里也没有任何意义。因为这个是FileCache::Open本身就是针对文件的,目录不需要缓存。open目录失败直接判负没有任何问题。

@xth
Copy link
Contributor Author

xth commented May 29, 2024

file_cache_ptr FileCache::Open(const char* filepath, OpenParam* param) {
std::lock_guardstd::mutex locker(mutex_);
此处锁加的范围好像有点大。没有区分读写锁。实际上如果没有写入,多线程读map是安全的。

@ithewei
Copy link
Owner

ithewei commented May 29, 2024

file_cache_ptr FileCache::Open(const char* filepath, OpenParam* param) { std::lock_guardstd::mutex locker(mutex_); 此处锁加的范围好像有点大。没有区分读写锁。实际上如果没有写入,多线程读map是安全的。

是有点大,该处不仅仅是锁cached_files这个map,锁住整个函数,是因为下面有对fc内部成员做修改的操作,为了避免多线程同时修改导致问题

@ithewei ithewei merged commit 6900f97 into ithewei:master May 29, 2024
6 checks passed
@fenghongwu
Copy link

针对这次的代码合并,我提2个缺点
1.这个代码改动里,使用了 std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>>,而这个std::codecvt_utf8_utf16,在后续的c++标准中从c++17开始就标记为准备废弃,自c++26开始就正式彻底移除了
参考链接:https://zh.cppreference.com/w/cpp/locale/codecvt_utf8_utf16

2.这个改动之前,按照微软的常规方案,单字节字符串的编码,默认根据本机地区设置而匹配了一种单字节字符串编码方案,比如我是简体中文,那么他就视为我传入的是gbk编码的单字节字符串,如果我是韩国地区,他会视我传入的是韩文对应的单字节字符串编码。而这种微软的历史约定,可能已经使得历史的接口使用者已经确保传入了的单字节字符串为相关地区的单字节字符串编码。但是本次提交,使得本库在windows环境下,调用hv_open传入的const char*filepath要求调用者必须使用utf8格式对单字节字符串进行编码。这很可能会导致之前的库的使用者在直接升级libhv版本的情况下,莫名的遇到了文件路径无效而突然无法访问的问题。

微软是有相关的api供我们实现【utf8】 【utf16】 【本机默认单字节字符串编码】三者之间进行转换的。
https://learn.microsoft.com/zh-cn/windows/win32/api/stringapiset/nf-stringapiset-widechartomultibyte
https://learn.microsoft.com/zh-cn/windows/win32/api/stringapiset/nf-stringapiset-multibytetowidechar
留意参数中标志可用的下述数值:
CP_ACP | 系统默认的 Windows ANSI 代码页。
CP_UTF8 | UTF-8

另外这个是C语言标准里提供的转换函数https://zh.cppreference.com/w/c/string/multibyte

@fenghongwu
Copy link

综上,建议暂时撤销本次改动。至于如何改,需要再议

@ithewei
Copy link
Owner

ithewei commented Sep 26, 2024

综上,建议暂时撤销本次改动。至于如何改,需要再议

是的,这个废弃声明我也注意到了,建议调研下nginx、apache是怎么处理中文编码问题的,欢迎提PR优化下

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants