Archive for February, 2009

One HTTPS site per IP address… or maybe not?

February 26th, 2009 1 comment

I randomly ran across SNI (Server Name Indication, specified in RFC 4366) tonight. It’s a TLS extension that has been under development since before 2000 and lets the client tell the server which domain it’s visiting before the server sends its certificate, so the server can pick the certificate that matches. The history is fascinating!
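For a feel of what the client side looks like, here is a minimal Java sketch. It assumes Java 8 or later, whose JSSE API is much newer than this post, and the class name SniClientSketch and the host shop.example.com are made up for illustration. It only sets the server name that would travel in the TLS ClientHello and makes no network connection.

```java
import java.util.Collections;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SNIServerName;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

class SniClientSketch {

    /** Returns the given parameters with the SNI extension set to a single host name. */
    static SSLParameters withServerName(SSLParameters params, String hostName) {
        SNIServerName name = new SNIHostName(hostName);
        params.setServerNames(Collections.singletonList(name));
        return params;
    }

    public static void main(String[] args) throws Exception {
        // A client-mode engine; no socket is opened in this sketch.
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        engine.setUseClientMode(true);

        // The name set here is what the client would announce before the
        // server has to choose a certificate.
        SSLParameters params = withServerName(engine.getSSLParameters(), "shop.example.com");
        engine.setSSLParameters(params);

        for (SNIServerName name : params.getServerNames()) {
            System.out.println(name);
        }
    }
}
```

A server that understands the extension can read this name during the handshake and choose the matching certificate, which is what makes several HTTPS sites on one IP address possible.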

The situation today is that SNI is not here yet. OpenSSL will support it starting in 0.9.9, but has offered it as a compile-time option (disabled by default) since 0.9.8f. Apache may support it in its next minor release (2.2.12), or maybe not… at least it’s in their trunk, so it will be released someday. I just installed the SNI patch on my Apache 2.2.11 server, and I’m going to try it out. IIS has no stated plans either way. The other popular servers, like Cherokee, lighttpd, and nginx, support it today.

But, as usual, browser support is the limiting factor, and, as usual, Internet Explorer is the culprit. IE only supports SNI on *Vista*, so given that IE6 still has a decent market share at 8 years old, it’s going to be at least 2017 before we can reliably host multiple HTTPS sites on the same IP address. And who knows about embedded browsers (like those in cell phones and PDAs)? Perhaps using one IPv6 address per HTTPS site will become practical before SNI is widely available… who knows.

Categories: Uncategorized Tags:

Why would a cache include cookies?

February 25th, 2009 No comments

Ehcache’s SimplePageCachingFilter caches cookies. And that baffles me… why would a cache include cookies in it?

I ran into an interesting situation where servlets, interceptors, and all those other Java goodies were writing cookies carrying things like the current user’s identifier, which the site uses to track that user and his shopping cart. The problem, which is obvious in retrospect but was incredibly puzzling at first, was that the cookies containing the user id were being cached along with the page, so when a subsequent user hit that page, he got the original requester’s user id and everything that implied (like his cart).

Since each page is cached separately and at separate times, and there is more than one user on the site, visitors would see their carts changing, items seemingly appearing and disappearing randomly, and other such fun. For example, if Alice happened to hit the home page when its cache was expired, her user id cookie ended up in the home page cache. Then Bob comes along and hits the accessories page when its cache has expired, so his user id cookie ends up in that page’s cache. Finally, Charles visits the home page, and sees Alice’s cart. Then, he goes to the accessories page, and sees Bob’s cart. It’s just an incredibly weird and confusing situation!
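The Alice, Bob, and Charles scenario can be reproduced with a toy model. This is only a sketch: CookieCachingSketch, PageCache, and CachedPage are hypothetical names and have nothing to do with Ehcache’s real classes. It just shows what happens when a per-user cookie is stored alongside a shared page.

```java
import java.util.HashMap;
import java.util.Map;

class CookieCachingSketch {

    /** A cached page: the rendered body plus the user id cookie captured with it, if any. */
    static final class CachedPage {
        final String body;
        final String userIdCookie; // null when the cache does not store cookies

        CachedPage(String body, String userIdCookie) {
            this.body = body;
            this.userIdCookie = userIdCookie;
        }
    }

    /** A toy page cache keyed by URL; entries never expire in this sketch. */
    static final class PageCache {
        private final Map<String, CachedPage> pages = new HashMap<>();
        private final boolean cacheCookies;

        PageCache(boolean cacheCookies) {
            this.cacheCookies = cacheCookies;
        }

        /** Serves the URL and returns the user id cookie the visitor ends up with. */
        String request(String url, String visitorId) {
            CachedPage page = pages.get(url);
            if (page == null) {
                // Cache miss: the application renders the page and sets the visitor's own cookie.
                pages.put(url, new CachedPage("<html>" + url + "</html>", cacheCookies ? visitorId : null));
                return visitorId;
            }
            // Cache hit: a stored cookie is replayed and overwrites the visitor's own id.
            return page.userIdCookie != null ? page.userIdCookie : visitorId;
        }
    }

    public static void main(String[] args) {
        PageCache leaky = new PageCache(true);
        leaky.request("/home", "alice");        // Alice fills the home page cache
        leaky.request("/accessories", "bob");   // Bob fills the accessories page cache
        System.out.println(leaky.request("/home", "charles"));        // prints "alice"
        System.out.println(leaky.request("/accessories", "charles")); // prints "bob"

        PageCache safe = new PageCache(false);
        safe.request("/home", "alice");
        System.out.println(safe.request("/home", "charles"));         // prints "charles"
    }
}
```

With cookie caching switched off, the cached body is still shared but each visitor keeps his own id.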

I’ve been racking my brain on the topic of caching cookies: when would it be useful? Cookies, as far as I can imagine (and have experienced), contain only user-specific information, so why would you cache them?

To solve this problem, I extended SimplePageCachingFilter and overrode the setCookies method, having it be a no-op. And I filed a bug report with Ehcache.

Apache’s mod_cache will include cookies in its cache too. But, in their documentation, they specifically point out the case of cookies in their example of how to exclude items from the cache. It seems Apache knows including cookies is a bad idea… perhaps they should default to excluded?


One instance at a time with PID file in Bash

February 16th, 2009 5 comments

Often, I want only one instance of a script to run at a time. For example, if the script is copying files, or rsync’ing between systems, it can be disastrous to have two instances running concurrently, and this situation is definitely possible if you run the script from cron.

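I figured out a simple way to make sure only one instance runs at a time, and it has the added benefit that if the script dies midway through, the next run will still start: it sees that the recorded PID is no longer alive and cleans up. A plain lock file without a PID can’t do that, so a crash leaves a stale lock that blocks every later run.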

Without further ado, here’s my script:

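#!/bin/bash
pidfile=/var/run/sync.pid
if [ -e "$pidfile" ]; then
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process exists (and can be signalled)
    if kill -0 "$pid" > /dev/null 2>&1; then
        echo "Already running"
        exit 1
    else
        # stale pid file left behind by a run that died
        rm "$pidfile"
    fi
fi
echo $$ > "$pidfile"

#do your thing here

rm "$pidfile"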

HTTP Caching Header Aware Servlet Filter

February 14th, 2009 7 comments

On the project I’m working on, we’re desperately trying to improve performance. One of the approaches taken by my coworkers was to add the SimplePageCachingFilter from Ehcache, so that Ehcache can serve frequently hit pages that aren’t completely dynamic. However, it occurred to me that the SimplePageCachingFilter can be improved by adding support for the HTTP caching headers (namely, ETags, Expires, Last-Modified, and If-Modified-Since). Adding these headers will do two important things:

  1. Allow Apache’s mod_cache to cache Tomcat-served pages, so that requests to these pages never even hit Tomcat, which should massively improve performance
  2. Allow browsers to accurately cache, so visitors don’t need to re-request pages after the first visit

Implementing these headers wasn’t terribly difficult – just tedious in that I had to read the relevant HTTP specification.
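The decision those headers drive can be sketched on its own, without any servlet or Ehcache types. The class and method names here (ConditionalGetSketch, notModified, httpDate) are made up for illustration, and the sketch makes two simplifying assumptions: If-None-Match carries a single ETag compared exactly, and when both request headers are present If-None-Match wins, which is how I read the HTTP specification.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

class ConditionalGetSketch {

    // Fixed-width HTTP date, always expressed in GMT
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                    .withZone(ZoneOffset.UTC);

    /** Formats an instant the way Last-Modified and Expires expect it. */
    static String httpDate(Instant instant) {
        return HTTP_DATE.format(instant);
    }

    /** Returns true when the server may answer 304 Not Modified instead of resending the page. */
    static boolean notModified(String etag, Instant lastModified,
                               String ifNoneMatch, String ifModifiedSince) {
        if (ifNoneMatch != null) {
            // When the client sends an ETag, the date header is not consulted at all.
            return ifNoneMatch.equals(etag);
        }
        if (ifModifiedSince != null) {
            try {
                Instant since = ZonedDateTime
                        .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                        .toInstant();
                // HTTP dates have one-second resolution, so drop the fraction before comparing.
                return !lastModified.truncatedTo(ChronoUnit.SECONDS).isAfter(since);
            } catch (DateTimeParseException e) {
                // An unparseable date is treated as if the header were absent.
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Instant cachedAt = Instant.ofEpochSecond(1234612800L);
        String etag = "\"1a2b3c\"";
        System.out.println(httpDate(cachedAt));
        System.out.println(notModified(etag, cachedAt, etag, null));               // prints "true"
        System.out.println(notModified(etag, cachedAt, null, httpDate(cachedAt))); // prints "true"
        System.out.println(notModified(etag, cachedAt, "\"stale\"", null));        // prints "false"
    }
}
```

A 304 response carries no body, which is where the bandwidth and rendering savings for repeat visitors come from.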

I sincerely hope that Ehcache picks up this class and adds it to the next version – I imagine that many applications could benefit from this class!

Here’s my class:

/**
*  Copyright 2009 Craig Andrews
*
*  Licensed under the Apache License, Version 2.0 (the "License");
*  you may not use this file except in compliance with the License.
*  You may obtain a copy of the License at
*
*      http://www.apache.org/licenses/LICENSE-2.0
*
*  Unless required by applicable law or agreed to in writing, software
*  distributed under the License is distributed on an "AS IS" BASIS,
*  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
*  See the License for the specific language governing permissions and
*  limitations under the License.
*/
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Collection;
import java.util.Date;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;
import java.util.zip.DataFormatException;
 
import javax.servlet.FilterChain;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
 
import org.apache.commons.lang.StringUtils;
 
import net.sf.ehcache.constructs.web.AlreadyGzippedException;
import net.sf.ehcache.constructs.web.PageInfo;
import net.sf.ehcache.constructs.web.ResponseHeadersNotModifiableException;
import net.sf.ehcache.constructs.web.filter.SimplePageCachingFilter;
 
/**
 * Filter that extends {@link SimplePageCachingFilter}, adding support for
 * the HTTP cache headers (ETag, Last-Modified, Expires, Cache-Control,
 * If-None-Match, and If-Modified-Since).
 */
public class HttpCachingHeadersPageCachingFilter extends
        SimplePageCachingFilter {

    // SimpleDateFormat is not thread-safe, so it is only used through the synchronized helpers below
    private static final SimpleDateFormat httpDateFormat = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss z", Locale.US);

    static {
        httpDateFormat.setTimeZone(TimeZone.getTimeZone("GMT"));
    }

    public synchronized static String getHttpDate(Date date) {
        return httpDateFormat.format(date);
    }

    public synchronized static Date getDateFromHttpDate(String date) throws ParseException {
        return httpDateFormat.parse(date);
    }

    @SuppressWarnings("unchecked")
    @Override
    protected PageInfo buildPage(HttpServletRequest request, HttpServletResponse response, FilterChain chain) throws AlreadyGzippedException, Exception {
        PageInfo pageInfo = super.buildPage(request, response, chain);
        if (pageInfo.isOk()) {
            //add the caching headers to the page before it is stored in the cache
            Date now = new Date();

            List<String[]> headers = pageInfo.getHeaders();

            long ttlSeconds = getTimeToLive();

            headers.add(new String[]{"Last-Modified", getHttpDate(now)});
            headers.add(new String[]{"Expires", getHttpDate(new Date(now.getTime() + ttlSeconds * 1000))});
            headers.add(new String[]{"Cache-Control", "max-age=" + ttlSeconds});
            //the ETag is a hash of the uncompressed body, so it changes whenever the content does
            headers.add(new String[]{"ETag", "\"" + Integer.toHexString(java.util.Arrays.hashCode(pageInfo.getUngzippedBody())) + "\""});
        }
        return pageInfo;
    }

    @Override
    protected void writeResponse(HttpServletRequest request, HttpServletResponse response, PageInfo pageInfo) throws IOException, DataFormatException, ResponseHeadersNotModifiableException {

        final Collection headers = pageInfo.getHeaders();
        final int header = 0;
        final int value = 1;
        for (Iterator iterator = headers.iterator(); iterator.hasNext();) {
            final String[] headerPair = (String[]) iterator.next();
            if (StringUtils.equals(headerPair[header], "ETag")) {
                if (StringUtils.equals(headerPair[value], request.getHeader("If-None-Match"))) {
                    // echo the validator and answer 304; setStatus, unlike sendError,
                    // leaves the response uncommitted and renders no error page
                    response.setHeader("ETag", headerPair[value]);
                    response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
                    return;
                }
                break;
            }
            if (StringUtils.equals(headerPair[header], "Last-Modified")) {
                try {
                    String requestIfModifiedSince = request.getHeader("If-Modified-Since");
                    if (requestIfModifiedSince != null) {
                        Date requestDate = getDateFromHttpDate(requestIfModifiedSince);
                        Date pageInfoDate = getDateFromHttpDate(headerPair[value]);
                        if (requestDate.getTime() >= pageInfoDate.getTime()) {
                            // use the same date we sent when the page was first cached
                            response.setHeader("Last-Modified", headerPair[value]);
                            response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
                            return;
                        }
                    }
                } catch (ParseException e) {
                    // unparseable date: ignore it and fall through to a full response
                }
            }
        }

        super.writeResponse(request, response, pageInfo);
    }

    /**
     * Get the time to live for a page, in seconds
     * @return time to live in seconds, or -1 if the cache is disabled
     */
    protected long getTimeToLive() {
        if (blockingCache.isDisabled()) {
            return -1;
        } else {
            if (blockingCache.isEternal()) {
                return 60 * 60 * 24 * 365; //one year, in seconds
            } else {
                return blockingCache.getTimeToLiveSeconds();
            }
        }
    }
}