wxHTTP and HTTP 1.1

I ran into a few problems while downloading files with wxHTTP from wxWidgets, and I am recording them here.

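1. wxHTTP is currently stagnant: the code is quite old (it dates back to 1997) and supports only HTTP 1.0. In the source file http.cpp you can see that BuildRequest() sets the HTTP version of the request to 1.0.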

2. The official example at http://wiki.wxwidgets.org/WxHTTP uses wxStringOutputStream, which is suitable only for plain-text HTML and not for files, not even XML files. To download a file, use wxFileOutputStream.

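3. When a GET request fetches a dynamic page generated by a script, the Content-Length in the HTTP headers is -1. The server sends the HTML while it is still generating it, so it cannot know the exact size in advance. This also means that SeekI() does not work on wxHTTPStream, the stream returned for the GET. The wxWidgets source code notes this in a comment as well. As a result, when downloading a script-generated page you cannot use SeekI() on the wxHTTPStream to skip some bytes and download only the rest; every download is a complete one.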

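4. The Range header is the basis of resumable downloads. It was added in HTTP 1.1 and does not exist in HTTP 1.0, so wxHTTP has no native support for it. In addition, script-generated dynamic pages generally do not support Range requests, and therefore do not support resuming: even if you add the header to the request, the server ignores it.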

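5. To use HTTP 1.1, you can derive from wxHTTP, reimplement GetInputStream() and BuildRequest(), and set the version to 1.1 in BuildRequest() with the line below, which upgrades the request to HTTP 1.1. Handling the HTTP response, however, then becomes your own job.
buf.Printf(wxT("%s %s HTTP/1.1\r\n"), request, path.c_str());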
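
As a rough illustration of what an HTTP 1.1 request looks like on the wire, here is a minimal sketch in standard C++, independent of wxWidgets. The function name BuildHttp11Request and its parameters are hypothetical and are not part of the wxHTTP API. The sketch only shows that HTTP 1.1 requires a Host header and that a Range header is one extra line.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of wxHTTP: builds the text of an HTTP/1.1
// request. If rangeStart is non-negative, the request asks for the bytes
// from that offset to the end of the resource.
std::string BuildHttp11Request(const std::string& method,
                               const std::string& host,
                               const std::string& path,
                               long rangeStart = -1)
{
    assert(!host.empty());                  // HTTP/1.1 requires a Host header

    std::string req = method + " " + path + " HTTP/1.1\r\n";
    req += "Host: " + host + "\r\n";
    req += "Connection: close\r\n";         // ask the server to close when done
    if (rangeStart >= 0)
        req += "Range: bytes=" + std::to_string(rangeStart) + "-\r\n";
    req += "\r\n";                          // an empty line ends the headers
    return req;
}
```

Calling BuildHttp11Request("GET", "example.com", "/file.bin", 1024) yields a request whose fourth line is "Range: bytes=1024-". Whether the server honours that header is a separate matter.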

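6. One of the tasks mentioned above: when downloading a script-generated dynamic page, the server responds differently depending on the HTTP version. With HTTP 1.0, the server sends the file as usual and closes the connection when the transfer finishes or an error occurs. HTTP 1.1 emphasizes reuse of the TCP connection. When the Connection header is set to close, the connection is closed when the transfer finishes or an error occurs; when it is set to keep-alive, the server does not close the connection on its own initiative. In that case the script-generated page is sent in chunks, and the response headers contain Transfer-Encoding: chunked. When reading from the wxHTTPStream, the first line is the size of the chunk, written in hexadecimal and counted in bytes (not characters). The chunk data starts on the second line and runs for exactly that many bytes (if the data contains a CR LF, each one counts as two bytes). A CR LF follows the data, and then the same format repeats until the end of the file. After that comes a chunk of size 0, which marks the end of the chunks. It may be followed by trailer headers (usually absent, and usually safe to ignore), and the body ends with a final CR LF.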

7. If no chunk of length 0 is received, an error occurred during the transfer.

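8. When computing the length of the chunk data, pay attention to the encoding: UTF-8 characters vary in length, so a character count and a byte count can differ.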
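
The chunked format described in notes 6 to 8 can be sketched as a small decoder in standard C++. This is only a sketch under some assumptions: the whole response body is already in memory, trailer headers are ignored, and the name DecodeChunked is hypothetical rather than anything provided by wxWidgets. It treats the size line as a hexadecimal byte count and reports failure if the zero-size chunk never arrives.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decodes a complete chunked body held in `raw`.
// Returns true only if the terminating zero-size chunk was seen;
// `out` is meaningful only in that case.
bool DecodeChunked(const std::string& raw, std::string& out)
{
    out.clear();
    std::size_t pos = 0;
    for (;;) {
        // The size line ends at CR LF.
        const std::size_t eol = raw.find("\r\n", pos);
        if (eol == std::string::npos)
            return false;                       // truncated: no size line

        // Parse the hexadecimal size; ';' starts an optional chunk extension.
        std::size_t size = 0;
        std::size_t digits = 0;
        for (std::size_t i = pos; i < eol; ++i) {
            const char c = raw[i];
            int v = 0;
            if (c >= '0' && c <= '9')      v = c - '0';
            else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
            else if (c == ';' || c == ' ' || c == '\t') break;
            else return false;                  // not a valid size line
            size = size * 16 + static_cast<std::size_t>(v);
            ++digits;
        }
        if (digits == 0)
            return false;
        pos = eol + 2;
        assert(pos <= raw.size());

        if (size == 0)
            return true;                        // last chunk; trailers ignored

        // The data is exactly `size` bytes, followed by CR LF.
        const std::size_t remaining = raw.size() - pos;
        if (size > remaining || remaining - size < 2)
            return false;                       // truncated chunk
        out.append(raw, pos, size);
        pos += size;
        if (raw.compare(pos, 2, "\r\n") != 0)
            return false;
        pos += 2;
    }
}
```

Because the data is copied by byte count, multi-byte UTF-8 characters need no special handling here; they only matter if the decoded bytes are later converted to characters.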

Howto: Using wxSocket in a secondary thread

Trackback:  http://www.litwindow.com/Knowhow/wxSocket/wxsocket.html

If you have tried this, you will have stumbled across the following problem(s):

100% CPU usage by the wxWidgets application.
All socket operations such as Read/Write time out.
Things work if you put them in the main thread.

Note: This information is valid for wxMSW (Microsoft Windows Platforms).

I have not tested this on Linux or other platforms.

To use wxSockets in a secondary thread, you must

Call wxSocketBase::Initialize from the main thread before creating any sockets. The best place to do this is in wxApp::OnInit.
Make sure that the main thread is always running. Do not sleep or use semaphores to block the main thread.

Here is a little background: This is an excerpt from my post on wx-users…

For wxWidgets on MSW (which also explains the
odd requirement to call wxSocketBase::Initialize in the main thread):
When Initialize is called for the first time, it creates a hidden
window. Socket operations do not block. Instead they use an
asynchronous model. All internal socket operations will return
immediately. When a write is completed, the OS will send a Windows
message to the window created by Initialize.

What happens in your case is:

a) If you don’t call Initialize in the main thread, the window will be
created in the secondary thread, which under wxWidgets has no message
loop. The main message loop will only handle windows created in the
main thread. So the events sent to this window will get lost.

b) If you do call Initialize in the main thread: When you call Write
in a secondary thread, the wxWidgets socket code waits in a busy loop.
It will call write once and then call a ‘getstatus’ method which loops
until an internal flag has been set. The code looks somewhat like:

while (!flag) {
    sleep(0); // give up timeslice
}

If you block the main thread with sleep or wxCondition, the event sent
by the operating system cannot be processed and will never reach the
hidden socket sink window and so cannot set the flag.

It is my impression that the socket classes seem to be not very well
integrated into wxWidgets and – at least in part – not very well
written. IMHO busy loops really should be avoided.
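
The interplay described in (a) and (b) can be modelled with plain standard C++ threads. This is a simulation only and uses no wxWidgets or Windows API: the atomic completionPosted stands in for the Windows message posted to the hidden window, flag stands in for the internal flag polled by the socket code, and the function name SimulatedWriteCompletes is made up for this sketch. The calling thread plays the role of the main thread.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Simulation only. Returns true if the worker's "write" completed before
// the timeout, false if it timed out while polling the flag.
bool SimulatedWriteCompletes(bool mainThreadPumps,
                             std::chrono::milliseconds timeout)
{
    assert(timeout.count() > 0);

    std::atomic<bool> completionPosted{false};  // message waiting for the hidden window
    std::atomic<bool> flag{false};              // set when that message is handled
    bool completed = false;                     // written by the worker, read after join

    std::thread worker([&] {
        completionPosted = true;                // the write was issued and the OS replied
        const auto deadline = std::chrono::steady_clock::now() + timeout;
        while (!flag) {                         // the busy loop from the excerpt
            if (std::chrono::steady_clock::now() >= deadline)
                return;                         // the socket operation times out
            std::this_thread::yield();          // comparable to sleep(0)
        }
        completed = true;
    });

    if (mainThreadPumps) {
        // The main thread runs its message loop and dispatches the message.
        while (!completionPosted)
            std::this_thread::yield();
        flag = true;
    } else {
        // The main thread is blocked, so the message is never dispatched.
        std::this_thread::sleep_for(timeout);
    }

    worker.join();
    return completed;
}
```

With mainThreadPumps set to true the simulated write finishes almost immediately. With false, the worker spins for the whole timeout and then gives up, which corresponds to the timeouts and the high CPU usage listed at the top.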