The challenges of mapping land cover

Before discussing the issues that we face with land cover mapping, it is worth asking first: why on Earth do we need to map land cover in the first place? Well, imagine you are moving to another city to attend college, moving to a new workplace or perhaps just visiting a new city. You need to figure out what that new place offers in terms of environment and quality of life. Is it a green city full of parks, gardens and squares, or is it a grey place full of concrete and tarmac? Perhaps you will choose to use Google Maps and have a look at the satellite images, virtually navigating around your new place. At that moment, you are performing a supervised land cover classification. You are gathering geospatial information from one or more satellite images and then - using some training data (since you know how a beautiful garden and a concrete square should look) - you use this knowledge to discriminate between areas. Perhaps you will then choose to live on the outskirts of your new town, close to a pond and a park, rather than in a downtown full of pavements and buildings.

At the University of Reading, as part of the MELODIES project, we do pretty much the same but using a rigorous scientific approach. MELODIES is all about exploiting diverse sources of Open and Linked data to develop new applications and technologies that will benefit society. One of the services that MELODIES will provide is to improve emissions inventories using Earth Observation (EO) data. We need to know what the actual land cover is in the UK and then identify the changes to this cover on a yearly basis. Emissions of greenhouse gases can occur due to changes in land use, so the differences between annual land cover maps allow us to improve the way the emissions due to land cover change are calculated. At county, country, continent and global scale, it is necessary to measure the impact of land cover transformations. All of this may sound reasonably straightforward; however, it is not as simple as that. Let's take a look at some of the challenges.

Do we need another land cover map?

Yes. In case you are not aware of it, there is already a variety of land cover maps for the UK. Some of them are produced using high resolution data, the kind of images in which you can distinguish avenues, crop fields, ponds, etc., but the effort required to produce an accurate land cover map using these data is so huge that it would be impossible - or extremely expensive - to produce one every year. An example of this is the Land Cover Map 2007 produced by our colleagues at the Centre for Ecology & Hydrology (CEH). Other maps are produced at global scale, their main goal being to map the main global ecosystems and minimize the errors while doing it. The one produced by Boston University using satellite data from the MODIS sensor is a good example; it is produced on a yearly basis and its code name is MCD12. This global map is an excellent product - but can we use it for regional studies? Let's take a look at Figure 1, which shows the number of land cover changes from one class or category to another from 2001 to 2012. Areas in white do not have a single change in 11 years; these are typically areas which have been used for agriculture or grazing. At the other end of the scale, some areas are in red. This is a suspiciously high value, as it's quite unlikely that the type of land cover in these areas of the United Kingdom has really changed 9 to 11 times in 11 years! This tells us that we need a land cover map that is fit for our purpose, that is, a map that is produced on an annual basis and which focusses on identifying land use changes.
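To make the idea behind Figure 1 concrete, here is a minimal sketch of how the number of class changes per pixel could be counted across a stack of yearly maps. It is only an illustration: the function names, the tiny grid and the class labels are made up for this example and are not the code or data used to produce the figure.

```python
def count_changes(yearly_classes):
    """Count how many times the label differs between consecutive years."""
    return sum(
        1 for prev, curr in zip(yearly_classes, yearly_classes[1:]) if prev != curr
    )


def change_count_map(yearly_maps):
    """yearly_maps: list of 2-D grids (lists of rows), one per year, same shape."""
    rows, cols = len(yearly_maps[0]), len(yearly_maps[0][0])
    return [
        [count_changes([year[r][c] for year in yearly_maps]) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 1 x 2 grid over 12 yearly maps (2001 to 2012):
# one pixel that never changes and one that flips class every year.
stable = ["crop"] * 12
unstable = ["crop", "grass"] * 6
maps = [[[s, u]] for s, u in zip(stable, unstable)]

changes = change_count_map(maps)
print(changes)  # [[0, 11]]
```

Twelve yearly maps give eleven year-to-year transitions, so a pixel that is labelled differently every single year reaches the maximum of 11 changes, which corresponds to the red end of the scale described above.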


Clouds, clouds and more clouds

One of the main advantages of using EO data is that we can observe large areas on a regular basis and therefore capture the seasonal changes in vegetation behaviour. We can ascertain whether it is dry and plants have no leaves, or whether vegetation is blooming and full of greenness. To do this, we measure the amount of sunlight reflected by the land each day - where a lot of green light is reflected into space (and the Earth looks green) we can presume that there are quite a lot of green leaves on the land surface during that day.

But how can we measure this reflected light over the UK when - as is often the case - it is overcast, gloomy and rainy for long spells during the year? Figure 2 illustrates this problem: it shows daily images from the MODIS sensor onboard two separate satellites, Terra and Aqua.

Fortunately we can overcome this issue by using a computer model to integrate all these daily observations and derive a single image which represents the average "colour" of the surface during a specific period of time - in other words, a perfect, cloud-free image as in Figure 3. Furthermore, because we are using satellite data, we can see beyond the visible and create images that use part of the electromagnetic spectrum that the human eye cannot sense, and create so-called "false colour" images that help us to identify different vegetation types.
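As a rough illustration of the compositing idea, the sketch below averages only the clear-sky observations of a single pixel over a period. This is a deliberately simplified stand-in, not the computer model used in MELODIES; the function name, the reflectance values and the cloud flags are all hypothetical.

```python
def composite(daily_values, cloud_flags):
    """Mean of the clear-sky observations for one pixel.

    Returns None if every observation in the period was cloudy.
    """
    clear = [v for v, cloudy in zip(daily_values, cloud_flags) if not cloudy]
    return sum(clear) / len(clear) if clear else None


# Hypothetical reflectance of one pixel over five days.
# Clouds are bright, so the cloudy days show much higher values.
reflectance = [0.80, 0.10, 0.12, 0.75, 0.11]
cloudy = [True, False, False, True, False]

value = composite(reflectance, cloudy)
print(round(value, 2))  # 0.11
```

Repeating this for every pixel and every period gives a cloud-free time series of surface "colour", provided each pixel has at least one clear observation in each period.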

Let's classify, if we can

Now that we have clear-sky images, the challenge is to classify our time series into several land cover classes or categories. There are several ways to do it, but an easy approach is to use a machine-learning algorithm: basically, an algorithm that can analyse our time series of surface “colour” and intelligently assign land cover categories. In order to train this algorithm we have to be certain of a few pixels and know the specific class to which they belong, e.g. grassland, cropland or urban areas. Ideally this information is gathered using ground-truth data; for example, foresters sometimes complete reports for a specific area stating the vegetation and soil types, the percentage of trees, shrubs and grasses inside the plot, etc. This process is extremely costly and not feasible for large areas. Another approach to collecting training data is to use land cover maps derived from higher resolution data, which is what we are doing in MELODIES. Using the aforementioned Land Cover Map 2007 produced by CEH, we have trained our algorithm using pixels that belong in their entirety to one single class, as shown in Figure 4.
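The following sketch shows the general shape of this step: keep only the coarse pixels whose high resolution cells all belong to one class, then train a classifier on their time series. The post does not name the algorithm used, so a simple nearest-centroid classifier is swapped in here purely for illustration; all function names, values and class labels are hypothetical.

```python
import math
from collections import Counter


def class_proportions(fine_labels):
    """Fraction of each class among the high resolution cells in one coarse pixel."""
    counts = Counter(fine_labels)
    return {cls: n / len(fine_labels) for cls, n in counts.items()}


def pure_class(fine_labels):
    """Return the class if the coarse pixel is entirely one class, else None."""
    props = class_proportions(fine_labels)
    return next(iter(props)) if len(props) == 1 else None


def train_centroids(samples):
    """samples: list of (time_series, class). Returns the mean series per class."""
    grouped = {}
    for series, cls in samples:
        grouped.setdefault(cls, []).append(series)
    return {
        cls: [sum(col) / len(col) for col in zip(*rows)]
        for cls, rows in grouped.items()
    }


def classify(series, centroids):
    """Assign the class whose mean time series is closest (Euclidean distance)."""
    return min(centroids, key=lambda cls: math.dist(series, centroids[cls]))


# Hypothetical coarse pixels: (surface "colour" time series, high resolution labels).
pixels = [
    ([0.20, 0.60, 0.30], ["grass"] * 4),
    ([0.10, 0.80, 0.10], ["crop"] * 4),
    ([0.15, 0.70, 0.20], ["crop", "grass", "crop", "grass"]),  # mixed, not used
]

training = [(s, pure_class(f)) for s, f in pixels if pure_class(f) is not None]
centroids = train_centroids(training)
label = classify([0.19, 0.62, 0.28], centroids)
print(label)  # grass
```

The same `class_proportions` helper also shows why mixed pixels are left out of the training set: a pixel that is half crop and half grass has no single correct label.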

A land cover map

Finally, after using daily satellite images to create our time series of surface “colour”, training the algorithm and using it to classify each pixel, we are able to produce a land cover map, as you can see in Figure 5. Two main classes dominate the United Kingdom: the crops on the East side and the grasslands on the West. Can you identify some other patterns, like the big metropolitan areas in grey? And inland water bodies?

We have described here only a few aspects of why it is important to map land cover and what the main challenges are. The next steps are the most challenging for us at MELODIES: we know how to create a land cover map, so now we are working towards identifying land use changes, and then investigating the impact of those changes in terms of greenhouse gases emitted to the atmosphere. All this is possible because of Open Data: environmental datasets which are available to the general public and the scientific community.

Next time you see some satellite data, you can say: it is possible to perform a classification using a time series of these data, but creating that time series and applying machine-learning algorithms is not that straightforward.


Submitted by Akli Benali on Tue, 2015-01-20 16:08
Congrats on the text, it is very well written, easy to read and focuses on the key points of land cover mapping. Very interesting to see that you have pixels in the UK that are classified differently a large number of times during the 11 year period. Could it be due to a low representation of those biomes at a global level? That happens in Mediterranean areas, which only cover about 3% of the globe, so the training data is very limited and thus errors are quite frequent. Best regards and keep on SMDA
Submitted by Gerardo Lopez-S... on Wed, 2015-01-21 14:25
Thank you for your comments. There are several reasons for the number of times the mapped land cover changed over 11 years. One is, as you mentioned, that some biomes did not have enough high quality training data. The same class might behave differently across the globe, therefore the training data should represent this variability. Another reason is the complex, heterogeneous landscape in the UK, which is quite patchy and fragmented. This means that at the resolution we are working at, where a pixel represents 500m by 500m on the ground, there are several land cover types inside a single pixel. The time series of surface "colour" therefore cannot discriminate this mixing, and the algorithm assigns the most likely class. Within MELODIES we are trying to overcome this issue by mapping land cover proportions within a pixel rather than only assigning a single land cover type.
