HTTP Content-Length

Not logged inOpenClonk Forum

Forum Home Help Search Watchlist Register Login

Topic Development / Developer's Corner / HTTP Content-Length

Post

By Günther [de]

Date 2010-10-15 21:56

Currently, the http client implementation in Clonk requires a Content-Length header. To get rid of that unfortunate limitations, C4NetIOTCP::Execute needs to tell C4Network2HTTPClient that read returned 0 (or that wsaEvents.iErrorCode[FD_CLOSE_BIT] was null, if I'm guessing how the windows API works from the Clonk code correctly). If read returned non-null, OnRecv is called, which proceeds to call UnpackPacket, which interprets the HTTP message and processes it if the amount of data read equals the expected amount. Otherwise, OnDisconn is called, which creates an error message if UnpackPacket didn't set the success flag earlier. So there are the two options I see to fix this with more or less impact on other users of C4NetIOTCP:

- Modify OnDisconn to take a failure flag, and make C4Network2HTTPClient::OnDisconn call C4Network2HTTPClient::UnpackPacket. Minimal impact on other users, but I'm not sure how C4Network2HTTPClient::OnDisconn would get the data.

- Have Execute call OnRecv on non-error close, let OnRecv pass a close flag to UnpackPacket on iSize 0. Could affect other users, but I suspect those wouldn't do anything because they would have read all packets in the previous call with the last new data. Would require fewer code changes in C4Network2HTTPClient.

Opinions? The code is in C4Network2Reference.[h,cpp] and C4NetIO.[h,cpp].

By Sven2

Date 2010-10-15 23:12

Isn't Content-Length a required field anyway?

By Günther [de]

Date 2010-10-16 01:49

Not at all.

By PeterW

Date 2010-10-16 12:07

When we wrote a web server for our Communication Systems lecture, it was! :(

By Günther [de]

Date 2010-10-16 18:50

Well, Clonk should be compatible with webservers that don't want to wait with the response until all content is computed. For long lists of running games that method could save some RAM on the masterserver, besides being easier to write.

By PeterW

Date 2010-10-18 16:15

Oh come on, we are talking about 100 kb here, tops. The current league backend doesn't even try to stream it right now. Having it pre-formatted as one continous blob of text would actually be a pretty good idea, especially because then you could start using the compression support I built into the engine on mawic's request (which isn't even in use right now as far as I know - but I might be wrong). In every case getting the actual content length is the least of your worries.

But on the other hand: Yes, more flexibility is generally a good thing. Even though I would argue that it's actually more of an upside for "smaller" league servers. This comes up with no fail every time someone tries to build a new one.

By PeterW

Date 2010-10-16 11:45

Basically, yes, option 2. C4NetIOTCP::Peer::Close should act if there is still data in the buffer and the connection was closed "normally". This would mean calling Peer::OnRecv without data, which would call UnpackPacket. Just pass an "EOF" flag through to all of them, and it should be enough for UnpackPacket to do the right thing.

By Günther [de]

Date 2010-10-16 18:56

While I'm at it,
if (!iBytesToRead) ++iBytesToRead;
appears to be made obsolete by clonk.de SVN revision 10234 ("proper buffer management in C4NetIOTCP::Peer. Should be faster allaround.") If we exit the loop after a full read (iBytesRead == iBytesToRead) we could even save the recv call when the connection was closed, as the loop is only entered when data is ready to be read or the socket closed.

By PeterW

Date 2010-10-18 15:59

Uhm, be *very* careful about changing this stuff. If I remember correctly, that increment was only necessary to make the recv syscall behave on some platform (Mac)? It sure wasn't obvious, or I wouldn't have bugged Sven to put a comment in there. And I'm pretty sure the Peer::OnRecv code wasn't the issue - the iBytesToRead hint from ioctl isn't really relevant there.

I'm also not sure whether trying to be "smart" there with "iBytesRead == iBytesToRead" is a good idea. On Windows we have notifications, and I'm not sure these will get generated without having an EINPROGRESS/EWOULDBLOCK before.

Whatever you do, make sure to test it on all platforms...

By Günther [de]

Date 2010-10-18 16:28

Well, the apparent reason for the ++iBytesToRead; is the Buf.New(iBytesToRead); call that's now gone. (Okay, I didn't actually check whether StdBuf::New works with 0 as parameter, I'm just assuming it doesn't.)

And in theory, read should never return EWOULDBLOCK, because the code just polled the file descriptor. That's obviously an optimistic assumption about the behavior of the operating system, but it should be guaranteed at least on Linux ;-) Regardless, windows has to create the close notification without a read call, or indicate readability with zero data available, because Clonk doesn't read() otherwise.

I'm assuming that we wouldn't want to do a read call with length 0 as the parameter. It should be the correct way to check for EOF, but the classic read loop with a static length doesn't do it, and corner cases tend to have more bugs.

But yeah, the conservative option is to do the read call. It obviously doesn't hurt, and closing the connection is not performance critical for Clonk. It's just that that obnoxious comment taunts me every time I look at that function. The author apparently knew why it was necessary, choose not to tell me, and instead opted to tell me that he wasn't telling. It's insulting.

By PeterW

Date 2010-10-18 16:41 Edited 2010-10-18 16:53

> And in theory, read should never return EWOULDBLOCK, because the code just polled the file descriptor.

Yes, but I am actively provoking one by calling recv again - this is what you want to change, don't you? And the reason is that I'm not sure that I will a proper "readable" notification before I haven't really made sure everything has been read. After all, data could arrive just after the first recv call.

And sorry. I guess I should have replaced that comment by something more informative back then. I am actually not sure anymore whether malloc played a role in this. Hm, maybe a combination: malloc returning a null pointer, then recv throwing up?

By Günther [de]

Date 2010-10-18 17:08

Er, yes. But poll is level-triggered, so having to read everything would be a windows-only thing. epoll, the Linux equivalent to Windows' "events", has both edge- and level-triggered notifications, so perhaps windows also has both?

Looking at the code, New() looks fine, apart from the malloc thing, though a nullpointer for a zero-length allocation should be fine for the buffer, as you couldn't dereference the pointer anyway. (Posix helpfully declares: "If the size of the space requested is 0, the behavior is implementation-defined: the value returned shall be either a null pointer or a unique pointer.", and C89 concurs.) But there's a suspiciously commented-out assertion that would fail on zero-length buffers in getMBufPtr. Sadly, the commit message of the commit that commented out the assertion is simply "fixes and enhancements". In any case, passing a nullpointer to read or recv is bad style.

Attachment: commit-64cd92d - patch so far - untested because a local masterserver didn't want to stop sending Content-Length (8k)

By B_E

Date 2010-10-18 20:09

Eh, content-length.

(Have to see about line endings, I don't know what it's set to here)

Attachment: OpenClonk_rev1886.patch (697B)

By Günther [de]

Date 2010-10-18 20:26

As mentioned on IRC, the issue was that Apache apparently sends content-length if the response fits entirely into a buffer or something. "ob_flush" makes it stop.

By PeterW

Date 2010-10-19 12:22

Uhm, I don't like the magic behavior of Read for iBytes == 0. That's Unix strangeness that I don't think we should replicate. Furthermore, your code calls UnpackPacket even if there is no data left to unpack.

By Günther [de]

Date 2010-10-19 16:26

Well, I like the elegance of encoding the state of the fd in the return value of read/recv. You can just do one call and know everything you need to know.

Anyway, the patch doesn't work yet when Content-Length is missing, so I need to to some debugging. I'll also optimize the unnecessary UnpackPacket away.

By PeterW

Date 2010-10-19 17:27

It would be far more elegant if "select" would already tell me that and the 0 return value would take the much more intuitive meaning of "nothing to receive" instead of -1 with errno == EWOULDBLOCK.

Well, feel free to optimize. I just wanted to give the reason why I proposed to implement it slightly differently...

By Günther [de]

Date 2010-10-19 18:25

Yes, it's more elegant for the blocking case. The main point is that it's a return value of read, though.

Both select and poll are not the best interfaces. There's a reason epoll exists, and I think it has an extra flag for remote close (EPOLLHUP). If it weren't for macosx, we could use that.

By PeterW

Date 2010-10-20 15:10 Edited 2010-10-20 15:14

> Yes, it's more elegant for the blocking case. The main point is that it's a return value of read, though.

Well, still. Closing the connection is an exceptional condition, so I would feel it to be the cleaner design if it used the -1 return + errno interface as well. But I suppose at that point it really becomes a matter of taste.

Speaking of which: Why didn't they just define all the different error conditions to result in unique negative return values? That would be even better from the "everything in the return value" point of view. Ironically, I think that's more or less how it works on Windows.

By Günther [de]

Date 2010-10-20 17:40

It's also the way the Linux system calls work, I think - the libc has to translate that to errnos. Maybe errno was for those functions where there's only one or even no error return value available, and then they went for consistency? I think it's generally agreed that it would have been better to use return values, though. Backwards-compatibility constraints don't just confine Windows :-) Both have to live with strange design decisions of the past, Windows also with some APIs that were designed for single-tasking, one-user-only times, Linux also with not being able to get programmers to use Linux-only APIs. epoll is a lot better than poll/select, but we have to maintain the poll/select code path for macosx anyway, so we don't use the nicer epoll interface. We'd probably use select or a generic poll emulation layer on windows, too, if windows were a niche operating system.

By PeterW

Date 2010-10-20 23:16

The make it even more complicated: As far as I know the "correct" thing to do for Mac OS X would actually be to use the BSD kqueue mechanism... ;)

By Günther [de]

Date 2010-10-21 02:31

Hm. Do those also use some kind of object that stores the fds to watch for in the kernel and signal connection-closed as a separate event? If yes, we might be able to reduce the ifdefs to some API renaming, instead of different application logic. Well, except for the GTK+ interface. But I think we can just use poll with the GTK+ fds and our epoll fd for that.

By PeterW

Date 2010-11-17 22:05

Um, is there a reason to remove Content-Length on the server side? Liberal in what you accept, conservative in what you send and stuff?

By B_E

Date 2010-11-17 22:12 Edited 2010-11-17 22:33

I actually haven't got a clue why I did it. The commit is a few weeks old, and I must have forgotten it. I'll revert it.

Edit:
I probably just wanted a diff to test it.

Topic Development / Developer's Corner / HTTP Content-Length

Post