Tag Archives: http

Debugging Java proxy settings

Java proxy settings are highly annoying. On Windows and Mac, Java does use the system proxy settings, but it doesn’t necessarily understand the same syntax as say IE or Safari/WebKit for the proxy exceptions. The end result is that you keep guessing why certain services deep inside your huge web application keep failing in mysterious ways.  On other platforms, you have to configure the proxy through system properties. Restarting a web application to test out configuration changes can take a long time.

While solving the problem might not be easy, at least I can help with debugging it, with a very simple one-liner: Java Proxy Check (GitHub, direct download link).

For java.net.URL and most other HTTP client code, Java 7 uses a class that can decide on a per-URL basis which proxy should be used: java.net.ProxySelector. javaproxycheck.jar can be run  from the command line to quickly test one or more URLs and see which proxy is selected for that URL:

$ java -Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=3128 -jar javaproxycheck.jar http://www.example.com https://secure.example.com
http://www.example.com                   [DIRECT]
https://secure.example.com               [HTTP @ proxy.example.com:3128]

On Mac OS X and Windows, Java uses the system proxy configuration; on other systems, it optionally can use Gnome settings, but by default relies on system properties being set at startup, as documented in Networking Properties, Proxies.

Download from Tivo fails with HTTP error 400

So tonight I was wondering why no new shows had been downloaded from my Tivo to my file server. Investigating my home-grown script’s log output, I  could see that all download attempts were met with an HTTP error 400 since Sunday. (My script works similarly to kmttg.)

I immediately tried in Safari, and had no problems downloading one of these shows. So more debugging was necessary. Enabling additional output for Python’s urllib2 (used for downloading the XML table of contents) and curl (used for downloading the media files) showed an unusual Set-cookie response header on all of the requests, setting a new value for the “sid” cookie. Quick recap: when you load the XML TOC over HTTPS, the Tivo sets a “sid” cookie which you need to supply to download the actual show’s file. No problem, my script was recording the cookie using cookielib.CookieJar, passing the resulting file to curl. Or so I thought. Inspecting the cookie jar I found it to be empty. How could that be?

At this point I turned to Firefox and the Live HTTP Headers extension. Interestingly, the problem was the same in Firefox, so it wasn’t a problem in my script. but was was going wrong? After some experimentation with variations of the Set-cookie header (using my internal web server and Apache’s very useful mod_asis), plus remembering the details of the cookie protocol, it became clear:

Set-Cookie: sid=542329DF12A3586B; path=/; expires="Saturday, 16-Feb-2013 00:00:00 GMT";

If a cookie has an expires attribute, and the current date and time is later than the date specified, the cookie becomes invalid, and the HTTP client should remove it and not transmit it anymore. If a server sends a Set-Cookie header with an expires attribute in the past, the cookie should be deleted. Bingo! (Safari  had an old cookie still set, so the Tivo just used that and did not remove the cookie.)

So the Tivo is apparently trying to set a persistent cookie (confusingly named “sid” as it’s not a session cookie), but doing so with a date in the past. And everything was working just fine until last Saturday! So either Tivo pushed a software update on Sunday, changing the expires date of that cookie, or it has had that date for ever, and eternity just ran out. I haven’t reverse engineered the firmware, but I would bet that someone way back when Tivo got started hard-coded the date “way in the future”, and everbody promptly forgot about it.

No word from Tivo yet, but this Tivo forum thread mentions the same broken cookie.

I managed to add a hack to my script that extracts the “sid” from the Set-Cookie header manually and hand the proper cookie value to curl for the downloads. And I got to learn about mechanize for Python!