Metadata Compression
One simple alternative to consider, if metadata file size is causing metadata distribution to become excessively slow, might be to compress the metadata before transferring it.
For example, the current InCommon metadata file is 7,139,737 octets in length.
If compressed with bzip2 -9 (see http://en.wikipedia.org/wiki/Bzip2 ), the current file drops in size to 1,238,006 octets, just 17.3% of the size of the original file. All else being equal, I'd therefore expect the file transfer time to be proportionately less.
If compressed with xz -9 (see http://en.wikipedia.org/wiki/Xz ), the current file drops in size still further, to just 1,027,212 octets, only 14.4% of the size of the original file.
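The size comparison above can be reproduced with a quick shell sketch. The sample file below is a generated stand-in (repetitive XML, like a metadata aggregate, compresses very well), not the actual InCommon aggregate, so the exact ratios will differ:

```shell
# Generate a stand-in "metadata" file of repetitive XML.
for i in $(seq 1 5000); do
  printf '<EntityDescriptor entityID="https://idp%d.example.org/idp/shibboleth"/>\n' "$i"
done > sample-metadata.xml

# -k keeps the original; -f overwrites any previous compressed output.
bzip2 -9 -k -f sample-metadata.xml    # writes sample-metadata.xml.bz2
xz    -9 -k -f sample-metadata.xml    # writes sample-metadata.xml.xz

# Report the sizes; both compressed files should be a small
# fraction of the original.
wc -c sample-metadata.xml sample-metadata.xml.bz2 sample-metadata.xml.xz
```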
While manual compression could easily be built into the process of preparing the metadata file, another option to consider might be mod_deflate, as discussed at http://www.devside.net/articles/apache-performance-tuning , which would enable compression to be negotiated between the client and the web server on the fly.
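For illustration, a minimal mod_deflate configuration might look like the following. The path is hypothetical (the actual metadata URL and vhost layout would differ), and this is a sketch, not the server's real configuration:

```apache
# Hypothetical sketch: enable on-the-fly gzip for the metadata file.
LoadModule deflate_module modules/mod_deflate.so

<Location "/InCommon/InCommon-metadata.xml">
    # Compress XML responses when the client sends Accept-Encoding: gzip;
    # clients that don't advertise support get the uncompressed file.
    AddOutputFilterByType DEFLATE application/samlmetadata+xml text/xml application/xml
</Location>
```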
5 Comments
Scott Cantor (osu.edu)
As far as I'm aware, we already have gzip enabled on the fly for any clients that support it.
(As well as HTTP caching, of course.)
Joseph St Sauver
I'm not sure the process is currently working.
Just as a baseline, without requesting compression:
11 seconds...
With compression requested:
Same elapsed time/file size.
Contrast that with what I see, for example, from NOAA. Uncompressed:
Versus:
Note that the file size in the NOAA case did change.
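A comparison like the one above could be scripted along these lines. The URL is a placeholder, not the real endpoint, and the test only works against a live server:

```shell
# Sketch: check whether a server negotiates compression on the fly.
# Substitute the actual metadata URL for this placeholder.
URL=https://mdq.example.org/metadata.xml

# Baseline: no compression requested.
curl -s -o /dev/null -D - "$URL" | grep -i '^content-encoding' || echo "no encoding"

# With compression requested: if the server negotiates gzip, the
# response headers should include "Content-Encoding: gzip".
curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' "$URL" | \
    grep -i '^content-encoding' || echo "no encoding"
```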
trscavo@internet2.edu
> As far as I'm aware, we already have gzip enabled
> on the fly for any clients that support it.
No, the InC metadata server does not support HTTP compression. We considered this at one point but decided against it because 1) increases in throughput were deemed marginal, and 2) AFAIK no other federation in the world is compressing metadata (which introduces unacceptable risk at the client). One or both of these considerations may have changed in the interim, so we should definitely reconsider.
> (As well as HTTP caching, of course.)
Yes, the InC metadata server supports HTTP Conditional GET.
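Conditional GET behavior can be verified from the command line roughly as follows. The URL is a placeholder, and this sketch assumes the server emits a Last-Modified header:

```shell
# Sketch: exercise HTTP conditional GET (substitute the real URL).
URL=https://mdq.example.org/metadata.xml

# First fetch: capture the Last-Modified validator from the headers.
last_mod=$(curl -sI "$URL" | awk -F': ' 'tolower($1)=="last-modified" {print $2}' | tr -d '\r')

# Revalidate: if the file is unchanged, a conforming server answers
# "304 Not Modified" with an empty body, so no bytes are re-transferred.
curl -s -o /dev/null -w '%{http_code}\n' \
     -H "If-Modified-Since: $last_mod" "$URL"
```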
Joseph St Sauver
Tom mentioned:
> compressing metadata (which introduces unacceptable risk at the client)
Can you talk more about the risk you see in this regard? The metadata would be cryptographically signed, right? So if anything went wrong during the download (whether due to a compression hiccup or something else), the newly downloaded metadata would be flagged as corrupted and would not get used. Wouldn't that ameliorate the risk?
Scott then commented, and Tom replied:
>> (As well as HTTP caching, of course.)
> Yes, the InC metadata server supports HTTP Conditional GET.
I'd take it a step beyond that and suggest it would be interesting to see whether something like Varnish Cache would further improve performance (see https://www.varnish-cache.org/ ).
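A minimal Varnish setup for this use case might look like the sketch below. The backend address, port, and TTL are all assumptions for illustration, not a recommended production configuration:

```vcl
# Hypothetical minimal VCL (Varnish 4+ syntax): put Varnish in front
# of the metadata server and cache the rarely-changing aggregate.
vcl 4.0;

backend metadata_origin {
    .host = "127.0.0.1";   # origin web server (assumed local)
    .port = "8080";
}

sub vcl_backend_response {
    # Cache the aggregate briefly; conditional GETs from clients can
    # then be answered from cache rather than hitting the origin.
    set beresp.ttl = 5m;
}
```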
trscavo@internet2.edu
>> compressing metadata (which introduces
>> unacceptable risk at the client)
>
> Can you talk more about the risk you see
> in this regard?
Since no federation supports HTTP compression, that particular Shibboleth feature has never been exercised in production. I don't want to be the first to do that.
>> the InC metadata server supports HTTP Conditional GET.
>
> I'd take it a step beyond that and suggest
> that it would be interesting to see if using
> something like Varnish Cache would further
> improve performance
I don't think Varnish will have any effect on metadata refresh. OTOH, the Federation Info pages might benefit from Varnish but there's no chance the TSG will support it.