After retrieving Ontario AQI values from the government web site, we had to:

- choose a subset of measurement locations
- turn the values into a digital time series for each measurement location
- "fill in the gaps", i.e. estimate values in between measurement locations
- generate images with the estimated values colored according to a scale

Although the government web site has data for about 27 locations in southern or central Ontario, we used only a subset of these. Many stations are clustered around major population centers (e.g. 5 or more in the Toronto area); we chose a single station from each such area. Other stations appeared to have frequent malfunctions leading to long periods of missing data, and we omitted some of these. Finally, from the remaining stations we chose a group that gave us the best geographical coverage we could achieve.

The Ontario government appears to post 3:00 p.m. daily measurements in its 2-week data summaries. This seems to be because the 3:00 p.m. reading is often the highest of the day. However, on a fair number of days in the year 2000 data the peak measurement occurred at 1:00 p.m., or even at 8:00 a.m. or 11:00 a.m. Also, some stations have missing 3:00 p.m. measurements on certain days, which are treated as "missing data" in the 2-week summaries, even though 1:00 p.m. samples are often available from such stations.

A second complication is that the AQI doesn't always reflect ground-level ozone. For a few stations at a few points in time, some other pollutant had a higher level (compared to its standard) than ozone and thus determined the AQI. Such points had to be excluded from the analysis.

So, we settled on a definition of our data points as representing the peak daily ozone-based AQI reading in a location, so long as at least one observation after 8:00 a.m. was available. For days not meeting these criteria a "missing data" value was recorded. The software automatically takes missing data into account when mapping.
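
The rule above can be sketched in a few lines of Python. This is only an illustration, not the software we actually used: the function name `daily_peak_ozone_aqi`, the `(hour, aqi, pollutant)` record layout and the `"ozone"` label are all hypothetical. It also assumes that "after 8:00 a.m." means strictly later than 8:00, and that the peak is taken over all ozone-based readings of the day.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical record layout for one hourly reading:
# (hour of day 0-23, AQI value, pollutant that determined the AQI).
Reading = Tuple[int, float, str]


def daily_peak_ozone_aqi(readings: Sequence[Reading],
                         cutoff_hour: int = 8) -> Optional[float]:
    """Return the day's peak ozone-based AQI, or None for "missing data"."""
    # Exclude readings where some other pollutant determined the AQI.
    ozone = [(hour, aqi) for hour, aqi, pollutant in readings
             if pollutant == "ozone"]
    # Assumption: at least one ozone reading strictly after the cutoff hour
    # is required; otherwise the day counts as missing.
    if not any(hour > cutoff_hour for hour, _ in ozone):
        return None
    return max(aqi for _, aqi in ozone)


# One station, one day: the 3:00 p.m. value is absent, the 1:00 p.m. value is used.
day = [(8, 30.0, "ozone"), (11, 38.0, "ozone"), (13, 45.0, "ozone")]
peak = daily_peak_ozone_aqi(day)
```

Returning `None` for a missing day keeps that station out of the day's map without any special-case value in the data.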

We were missing two kinds of data that could be used to do a more scientifically accurate interpolation of data values between measurement locations:

- variability estimate (variance, sigma squared) for each location
- air parcel movement data (wind directions over time)

Lacking these, we decided on a simple weighted average estimation method. For any estimation point, a set of weights is derived from a mathematical model, one for each measurement location with a data value on that day. The AQI estimate at the estimation point is the weighted average of the measured values. The weights are adjusted so that:

- near a measurement location the estimated values converge to the measured value
- at points equidistant from N locations the estimate is the simple average of the N location values

The mathematical function used emphasizes locality over global estimation. In other words it assumes that close to one measurement location the value measured there is the best estimate. On some generated maps you can see small circular patches surrounding a location, due to this model assumption. So, take such artifacts of the estimation method with a grain of salt!
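
To make the two weighting properties concrete, here is a minimal sketch using inverse-distance weighting. This is an assumed stand-in, not necessarily the function we used: inverse-distance weights simply happen to satisfy both properties listed above. The name `estimate_aqi`, the `(x, y, value)` station layout and the `power` parameter are all hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

# Hypothetical layout for one station on one day: (x, y, value),
# where value is None when the station has missing data.
Station = Tuple[float, float, Optional[float]]


def estimate_aqi(x: float, y: float, stations: Sequence[Station],
                 power: float = 2.0) -> Optional[float]:
    """Weighted average of station values with weights 1 / distance**power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        if value is None:
            continue  # stations with missing data get no weight that day
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # exactly at a station: use the measured value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        return None  # no station reported a value on this day
    return weighted_sum / weight_total


stations = [(0.0, 0.0, 20.0), (10.0, 0.0, 40.0), (0.0, 10.0, None)]
midpoint = estimate_aqi(5.0, 0.0, stations)  # equidistant from both stations with data
near = estimate_aqi(0.001, 0.0, stations)    # very close to the first station
```

With weights of this kind, the nearest station dominates as the distance to it shrinks, which is the sort of behavior that produces circular patches around a location.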

A map of Ontario was downloaded from the Ontario government site. Locations of measurement sites were digitized from this map, and this set of coordinates was used for generating new maps. In each new map a small northern part of the original map is left in place to check registration.

Maps were generated originally as .bmp files and then converted to JPEG format for this web site. Some accuracy was lost in this conversion: the .bmp files are exact representations of the estimated data, while the JPEG files blur some sharp boundaries.