Scottish Marine and Freshwater Science Volume 5 Number 10: Updating Fisheries Sensitivity Maps in British Waters

The requirement to display sensitive areas relating to the life history of commercially important fish species in British waters is well recognized and has been used by the Oil and Gas and other offshore industries for over thirty years. An update of thes


2 Method

2.1 Fisheries Survey Data

2.1.1 Areas of 0 group Aggregations

In Gibb et al. (2007) nursery areas were defined as areas of habitat which support significantly higher juvenile densities than other areas. In this study the same definition was used for 0 group aggregation areas. To determine these, hauls with aggregations of 0 group fish of selected species were identified from several national and international fisheries surveys, and their distribution modelled using species distribution models, as detailed in the sections below.

The raw data for the trawl surveys used were downloaded in July 2012 from DATRAS, the Database of Trawl Surveys maintained by the International Council for the Exploration of the Sea (ICES) (http://www.ices.dk/marine-data/data-portals/Pages/DATRAS.aspx). Data from commercial fishing observer trips, carried out by Marine Scotland Science staff between 2005 and 2011 to gather data on fish discarding and held in the Scottish Government Fisheries Management Database (FMD), were also used. Other data sources include the Inshore Surveys carried out from 2001 to 2004 and summarised by Gibb et al. (2007).

Data were filtered by month so that only Quarter 3 and Quarter 4 hauls were considered, as no Age-Length-Keys (ALKs) were available for 0 group fish in Quarters 1 and 2. This is because, to standardise age estimates when ageing fish, the convention is to use the 1st of January as the birthday for most Atlantic species (Holden & Raitt, 1974). The data sources used are summarised in Table 1 and Table 2, and their haul distribution can be seen in Figure 1, below.

As fish distribution is subject to change, only years >= 2000 were considered for cod and other gadoid species (haddock, whiting, saithe, ling and Norway pout).

Table 1

Summary list of fishery surveys used to collate 0 group fish data. GOV = Grande Ouverture Verticale, BT = Beam Trawl, (+) = See Table 2.

Survey Years Quarters Gear Country Reference Pelagic Gadoid Benthic
International Bottom Trawl Surveys (IBTS)
North Sea IBTS (NS-IBTS) 1991-2012 3, 4 GOV(+) Various ICES (2012a)
Scottish West Coast IBTS (SWC-IBTS) 1990-2011 4 GOV Scotland
Irish Ground Fish Survey (IE-IGFS) 2003-2008 3, 4 GOV Ireland
Evaluation Halieutique Ouest Européen (EVHOE) 1997-2012 4 GOV France
Beam Trawl Surveys ( BTS)
BTS 1987-2011 3, 4 BT(+) Various ICES (2009) X
BTS-VIIa 1993-2008 3, 4 BT 4m England X
Inshore Surveys
Various chartered fishing vessels 2001 3 Various Scotland Gibb et al. (2007) X Plaice
Alba na Mara 2002-2004 4 BT158 Scotland X
Discard trips
Various fishing vessels 2005-2011 3, 4 Various Scotland Fernandes et al. (2011)

Table 2

Summary of participating countries in the North Sea IBTS and the BTS surveys used to collate 0 group fish data. GOV = Grande Ouverture Verticale, GRT = Granton trawl, DHT = Dutch Herring Trawl, ABD = Aberdeen 18 ft trawl, BT = Beam Trawl.

Survey Quarter 3 Quarter 4
Country Years Gear Country Years Gear
North Sea IBTS Denmark 1998-2011 GOV Denmark 1991-96 GOV
England 1991-2011 GRT (1991)
GOV
England 1991-96 GOV
France 1992-96 GOV France 1995 GOV
Germany 1992, 96-2011 GOV Netherlands 1991-96 GOV
Netherlands 1991-97 GOV Norway 1991-96 GOV
Norway 1999-2011 GOV 2003-04
Scotland 1991-2011 DHT (1991)
ABD (1992-97)
GOV
Sweden 1991-2011 GOV
BTS England 1990-2011 BT 4m England 2010 BT 4m
Germany 2003-2011 BT 7m
Netherlands 1987-2011 BT 8m

Figure 1: Distribution of fishery surveys used to collate juvenile fish data.


The cut-off lengths used to identify 0 group fish of the different species are summarised in Table 3. All lengths are "less than" the value shown. These lengths were determined from the available survey age-length keys (ALKs) and chosen to include more than 90% of 0 group fish and less than 6% of 1 group fish. The exception to this rule is sprat in Quarter 3 of the North Sea IBTS, where a length of <9.0 cm represents 88.1% of 0 group fish and 6.86% of 1 group fish.
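The two criteria above can be illustrated with a short Python sketch (the study itself was carried out in R). The function names, the invented ALK rows and the rule of taking the smallest candidate length that satisfies both criteria are assumptions made for illustration only; the report does not state how the final value was picked among qualifying lengths.

```python
def cutoff_coverage(records, cutoff_cm):
    """Return (share of age-0 fish, share of age-1 fish) shorter than cutoff_cm.

    records: iterable of (length_cm, age, count) rows from an age-length key.
    """
    below = {0: 0.0, 1: 0.0}
    total = {0: 0.0, 1: 0.0}
    for length_cm, age, count in records:
        if age in total:
            total[age] += count
            if length_cm < cutoff_cm:  # Table 3 lengths are "less than"
                below[age] += count
    return tuple(below[a] / total[a] if total[a] else 0.0 for a in (0, 1))


def choose_cutoff(records, candidates, min_age0=0.90, max_age1=0.06):
    """Smallest candidate cut-off covering >90% of age 0 and <6% of age 1.

    Picking the smallest qualifying value is an assumption of this sketch.
    """
    for cutoff in sorted(candidates):
        share0, share1 = cutoff_coverage(records, cutoff)
        if share0 > min_age0 and share1 < max_age1:
            return cutoff
    return None


# Invented ALK rows: (length class in cm, age, number of fish).
alk = [(14, 0, 30), (16, 0, 50), (18, 0, 15), (20, 0, 5),
       (18, 1, 2), (20, 1, 10), (22, 1, 40), (24, 1, 48)]
print(choose_cutoff(alk, candidates=range(15, 26)))
```

With these invented numbers, a 19 cm cut-off includes 95% of the 0 group fish and 2% of the 1 group fish, so it is the first candidate to meet both criteria.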

ALKs were only available for some of the species in some of the surveys. When no ALK information was available, information from the closest survey (in space) was used where it existed; otherwise, the preliminary age-length splits from the literature were used (ICES, 2012a).

Table 3

Cut-off lengths (cm) used to identify 0-group fish. All lengths are "less than".

Species NS-IBTS(Q3) NS-IBTS(Q4) SWC-IBTS(Q4) IE-IGFS(Q3) IE-IGFS(Q4) EVHOE(Q4) BTS/BTS-VIIa(Q3) BTS/BTS-VIIa(Q4) Discards(Q3) Discards(Q4)
Cod 20 23 33 20 33 30 20 33 20 23
Haddock 18 21 21 18 21 21 18 21 18 21
Whiting 17 20 20 17 20 22 17 20 17 20
Norway pout 12 13 14 12 14 14 12 14 12 13
Saithe 22 25 25 22 25 25 - - 22 25
Herring 15.5 16.5 16.5 15.5 16.5 16.5 - - 15.5 16.5
Mackerel 23 26 26 23 26 26 - - 23 26
Horse mackerel 9 15 15 9 15 15 - - 9 15
Sprat 9.0 9.5 9.5 9.0 9.5 9.5 - - 9.0 9.5
Blue whiting 19 19 19 19 19 19 - - 19 19
Plaice 14 16 17 15 17 17 15 17 14 16
Sole 12 13 13 12 13 13 12 13 12 13
Hake 19 19 19 19 19 19 19 19 19 19
Anglerfish 16 16 16 16 16 16 16 16 16 16
Ling 21 21 21 21 21 21 21 21 21 21

In this study, areas of habitat which support significantly higher 0 group fish densities than other areas were identified. To achieve this, the distribution of 0 group fish aggregations was modelled, rather than the distribution of all 0 group fish. This work uses a density threshold to delimit what is considered a presence before applying the model. Similar approaches have been used in other species distribution model (SDM) studies. For example, Howell et al. (2011) compared the distribution of Lophelia pertusa with the distribution of L. pertusa reefs (defined not only by abundance, but clearly related to it), whereas other authors (Moritz et al., 2012; Martín-García et al., 2013) modelled communities defined by differences in species biomass or using a density threshold implicit in their definition (e.g. brown garden eel).

Aggregations of 0 group fish were identified by sorting in ascending order all hauls where the selected species was present in each survey type, then ranking their abundance and selecting the top quartile of the distribution (≥ 75%). In this way catch data were standardised by reducing each haul's catch to presence (≥ 75%) or absence (< 75%) of aggregations of 0 group fish for each species, thus reducing the variation introduced by differences in sampling methodology between surveys, by gear and vessel performance, and by catch variation between and within years.

For gadoids and other demersal species this was done across the entire dataset, as their numbers remained consistent over the years, while for pelagic species the process was repeated on a year by year basis, as numbers by haul can vary by several orders of magnitude from one year to the next. Given the numbers involved and the extreme differences in numbers from year to year, two additional criteria were introduced when identifying aggregations for pelagic species:

a) Fewer than 10 fish per haul was never considered an aggregation, even if within the top quartile (≥ 75%);
b) More than 500 fish per haul was always considered an aggregation, even if outside the top quartile.
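The aggregation rule and the two additional pelagic criteria can be sketched in Python as follows (the study itself used R). The function and field names are hypothetical, the haul counts are invented, and the nearest-rank quantile with a "greater than or equal to" comparison is one possible reading of the ≥ 75% rule; the report does not specify how ties or the quartile boundary were handled.

```python
import math
from collections import defaultdict


def nearest_rank_quantile(values, q):
    """Value at quantile q (0 to 1) of the values, nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def flag_aggregations(hauls, q=0.75, by_year=False,
                      never_below=None, always_above=None):
    """Return a 1/0 aggregation flag per haul, in input order.

    hauls: list of dicts with 'year' and 'count' (0 group fish per haul).
    Only hauls where the species is present enter the ranking.
    """
    groups = defaultdict(list)
    for haul in hauls:
        if haul["count"] > 0:
            groups[haul["year"] if by_year else "all"].append(haul["count"])
    thresholds = {key: nearest_rank_quantile(counts, q)
                  for key, counts in groups.items()}

    flags = []
    for haul in hauls:
        count = haul["count"]
        key = haul["year"] if by_year else "all"
        flag = count > 0 and count >= thresholds[key]
        if never_below is not None and count < never_below:
            flag = False  # criterion a): too few fish to be an aggregation
        if always_above is not None and count > always_above:
            flag = True   # criterion b): always an aggregation
        flags.append(int(flag))
    return flags


# Invented hauls for three years.
hauls = ([{"year": 2001, "count": c} for c in (0, 5, 20, 40, 60)]
         + [{"year": 2002, "count": c} for c in (100, 600, 5000, 20000)]
         + [{"year": 2003, "count": c} for c in (2, 4, 6, 8)])

demersal = flag_aggregations(hauls)  # pooled over all years
pelagic = flag_aggregations(hauls, by_year=True,
                            never_below=10, always_above=500)
print(demersal)
print(pelagic)
```

In the pooled case the low-abundance years contribute no aggregations at all, whereas the year-by-year ranking flags the top hauls of 2001; the 2003 hauls stay unflagged because they contain fewer than 10 fish, and the 600-fish haul of 2002 is flagged by criterion b) although it falls below that year's quartile threshold.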

For saithe and ling there were too few hauls with 0 group fish present for aggregations to be identified. As the juveniles of both these species stay in their inshore nursery habitats until they are 2-3 years of age (Heessen et al., 2006; Rowley, 2008), nursery areas could in the future be modelled using data for age 1 fish.

2.1.2 Herring Small Larvae Aggregation Area

Herring spawning grounds have been defined using a number of data sources, including grab surveys on spawning grounds (e.g. Parrish et al., 1959; Bowers, 1969), the presence of recently hatched larvae (reviewed in Heath, 1993), the presence of herring eggs in haddock stomachs (e.g. Bowman, 1922) and the capture of mature adult fish from both commercial boats and surveys (ICES, 2010a). At this stage of the work, data for recently hatched larvae were used and their distribution modelled by species distribution models, as detailed in the sections below.

The herring to the west of the British Isles are currently fished, managed and assessed separately as four stocks: 1) VIa North; 2) VIa South and VIIb,c; 3) Irish Sea (VIIaN) and 4) Celtic Sea and VIIj (ICES, 2010a). Similarly, the North Sea herring stock is generally understood as representing a complex of multiple spawning components (Cushing, 1955; Harden Jones, 1968; Iles and Sinclair, 1982; Heath et al., 1997). Most authors distinguish four major components, highlighted in Figure 2, each defined by distinct spawning times and sites (Iles and Sinclair, 1982; Corten, 1986; Heath et al., 1997). The Orkney-Shetland component spawns in August/September; the Buchan component to the east of Scotland in September/October; the Banks component off the English coast around the same time; and the Downs component in the English Channel mainly during December. Although the different components mix outside the spawning season and are exploited together, each component is thought to have a high degree of population integrity (Iles and Sinclair, 1982) and, therefore, could be expected to have relatively unique population dynamics (Payne, 2010).

The ICES programme of International Herring Larval Surveys (IHLS) in the North Sea and adjacent areas has been in operation since 1967. Its main purpose is to provide quantitative estimates of herring larval abundance, which are used in the assessment as a relative index of changes in the herring spawning stock biomass (ICES, 2008).

The larval surveys are carried out in specific time periods and areas, following the autumn and winter spawning activity of herring from north to south (ICES, 2008), and are considered to have been consistent since 1972 (Payne, 2010). Survey data are reported to the ICES International Herring Larvae database annually. This database contains information about the surveys conducted since 1972 and is currently available through the ICES Eggs and Larvae Data Portal (http://www.ices.dk/marine-data/data-portals/Pages/Eggs-and-larvae.aspx).

The raw data for the IHLS used in this work were downloaded from this portal in March 2013 and are summarised below in Table 4 (west of Scotland and northwest Ireland) and Table 5 (North Sea). The IHLS haul distribution, highlighting the different sampling regions, can also be seen in Figure 2.

Surveys off northwest Ireland suffered from poor sampling coverage and were discontinued after 1988 (Heath et al., 1993), as were those off the west of Scotland after 1994. Because of this difference in temporal coverage, and also because herring in the Atlantic and in the North Sea are different stocks (see references above), data east and west of the 4°W meridian were treated and modelled separately.

Figure 2: Distribution of International Herring Larvae Survey ( IHLS) hauls, highlighting sampling regions, from 1972 to 2011.


Table 4

Summary of participating countries in the IHLS, west of Scotland and northwest Ireland, from 1972 to 1994.

Country Years Season Month Gear
Region: West of Scotland and Northwest Ireland
Germany 1980-89 Autumn Sep GULF III
Ireland 1981-88 Autumn Sep (86-87) GULF III
Oct, Nov
Netherlands 1974, 1980 Autumn Sep (74), Oct (80) GULF III
Norway 1980 Autumn Oct GULF III
Scotland 1972-94 Autumn Aug (74, 78-79) DG III (82-89)
Sep, Oct GULF III (72-81, 90-94)
England 1972-75 Autumn Sep GULF III

Table 5

Summary of participating countries in the IHLS, North Sea, from 1972 to 2011.

Country Years Season Month Gear
Region: Orkney / Shetland
Denmark 1972, 1975 Autumn Sep GULF III
1981-83, 88-89
Germany 1974-77
1979-2011
Autumn Aug (79, 81-82, 85, 88-89) GULF III
Sep, Oct
Netherlands 1977-87 Autumn Aug (77-78, 82, 84)
Sep
TORPEDO (84)
GULF III
2004-06, 08-09
Norway 2000 Autumn Sep GULF III
Scotland 1972-75
1977-89
Autumn Aug (73) DG III (83-87, 89)
GULF III (72-82, 88)
Sep, Oct (74, 77-78)
England 1972-75 Autumn Sep GULF III
Region: Buchan
Denmark 1972, 1974 Autumn Sep GULF III
1981-89 DG III (86)
Germany 1976, 93, 96-97 Autumn Sep GULF III
2000, 02, 07, 09
Netherlands 1978, 82, 88-92, Autumn Sep
Oct (78, 89-90)
GULF III
96, 98-2011
Norway 1979, 81, 2000 Autumn Sep (00), Oct (79, 81) GULF III
Portugal 1977 Autumn Sep, Oct GULF III
Scotland 1972-1989
1993
Autumn Aug (72-73), Sep
Oct (72, 74-75, 93)
DG III (83-87, 89)
GULF III (72-82, 88, 93)
England 1973 Autumn Oct GULF III
Region: Central North Sea / Banks
Denmark 1974 Autumn Aug, Sep GULF III
Germany 1998-99 Autumn Sep (98) GULF III
2003 Oct (99, 03)
Netherlands 1972-73 Autumn Aug (87), Sep,
Oct (72-73, 75-76,
78, 80, 87-95, 02-03)
TORPEDO (84)
1975-96 GULF III (72-03)
1998-2011 GULF VII (04-11)
Norway 1976-81 Autumn Oct GULF III
Portugal 1976 Autumn Oct GULF III
England 1972-89 Autumn Aug (79, 84)
Sep (72-73, 76-81, 84-85, 87)
Oct
20 TTN (81-82)
50 TTN / HSTN (83-84)
53 TTN / HSTN (84-89)
GULF III (72-80)
Region: Southern North Sea / Downs
France 1981-82 Winter Jan, Feb (82) GULF III
Germany 1972-74
1979-2011
Winter Jan GULF III
Netherlands 1972-2011 Winter Dec
Jan
GULF III (72-03)
GULF VII (04-11)
England 1972-75, 77
1979-1989
Winter Dec (82)
Jan
Feb (73, 80-81, 88)
20 TTN (81-82)
50 TTN / HSTN (83-84)
53 TTN / HSTN (84-89)
GULF III (72-80, 87, 89)

The IHLS is centred on the estimation of a Larval Abundance Index (LAI). An annual LAI is calculated on the basis of catches of only the most recently hatched larvae, and in particular larvae <11 mm in length in the Downs region and <10 mm in all the other regions (Heath, 1993; ICES, 2008; Payne, 2010). The mean hatching length of larvae is approximately 6.5 mm, and growth rates estimated from field investigations have been approximately 0.2 to 0.3 mm per day. Hence, the LAI includes all larvae up to approximately 10 to 15 days old (Heath, 1993).

Similarly to what was described in Section 2.1.1 above for 0 group fish, IHLS catch data were also standardised by reducing each haul's catch to presence or absence of aggregations of small larvae. Rankine (1986) suggested that densities of >500 larvae per square metre should be used to indicate the main spawning grounds around the Scottish coastal areas. However, because in some of the years (1974 to 1980) the maximum density was well below this level at most of the sampling regions, a threshold of 85% of the distribution density curve was used, in Rankine's work, to represent the core spawning area per year, as proposed in ICES (2008). Therefore, aggregations of small larvae were identified for every year by sorting in ascending order all hauls where small larvae were present, ranking their abundance and selecting the top 15% of the distribution (≥ 85%), as per Rankine (1986). In this report these larval aggregations are not used as a proxy for spawning areas of herring but simply as a predictive spatial representation of small herring larvae aggregations.

During the Quarter 1 IBTS an international herring larval survey takes place which uses a Methot-Isaacs-Kidd (MIK) net to survey over-wintering herring larvae. These data were not used in the model for predicting small larvae aggregations because, as four to five months may have passed since spawning, these fish are mostly larger than 11 mm and are entering the post-larval stage.

In addition, the available MIK data collation has a coarse spatial resolution (the data points are averaged to a statistical rectangle), which makes these data incompatible with the finer-resolution environmental layers required for the model.

This coarse resolution of the MIK data collation also precludes data from the MIK survey being used in the 0 group aggregation predictions, for the same reason.

2.2 Environmental Data

All spatial operations and analysis used to prepare the environmental layers at this point of the work were developed using the ESRI ® ArcGIS application for desktop ArcMap™, version 10.0.

Water depth, in metres, was obtained from the gridded bathymetry dataset GEBCO_08. The GEBCO_08 Grid is a global 30 arc-second grid largely generated by combining quality-controlled ship depth soundings with interpolation between sounding points guided by satellite-derived gravity data. It is developed by the General Bathymetric Chart of the Oceans (GEBCO) and made available through the British Oceanographic Data Centre (BODC) online at http://www.bodc.ac.uk/data/online_delivery/gebco/. The version used for this work, version 20100927, was released in September 2010 and downloaded in May 2013.

The GEBCO_08 grid was the spatial layer with the finest resolution available for the present work. Therefore, in order to preserve the best possible resolution in all layers, GEBCO_08 was used as the limiting factor for spatial resolution. At the latitudes of the present study area, a 30 arc-second grid has an average cell size of 780 x 780 m, and so all the other layers were created with, or resampled to, this grid resolution.

The layer slope was derived from the bathymetry grid GEBCO_08, by using the "Slope" tool of the Spatial Analyst package of ESRI ® ArcMap™. Slope is the gradient, or rate of maximum change in z-value, of each cell of a raster surface.
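As an illustration of what a slope layer contains, the Python sketch below computes slope in degrees from a small depth grid using a 3 x 3 moving window with the weighted finite differences commonly documented for raster slope tools. It is a sketch only, not the ArcMap implementation: the function name is hypothetical, edge cells are simply left empty, and depths and cell size are assumed to be in the same units (metres).

```python
import math


def slope_degrees(grid, cell_size):
    """Slope in degrees for interior cells of a raster; edge cells are None.

    grid: list of rows of depth values; cell_size in the same units as depth.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            nw, n, ne = grid[r - 1][c - 1], grid[r - 1][c], grid[r - 1][c + 1]
            w, e = grid[r][c - 1], grid[r][c + 1]
            sw, s, se = grid[r + 1][c - 1], grid[r + 1][c], grid[r + 1][c + 1]
            # Rate of change in the x and y directions over the 3 x 3 window.
            dz_dx = ((ne + 2 * e + se) - (nw + 2 * w + sw)) / (8 * cell_size)
            dz_dy = ((sw + 2 * s + se) - (nw + 2 * n + ne)) / (8 * cell_size)
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out


# Invented grid: depth increases by one cell size per column.
ramp = [[0.0, 780.0, 1560.0] for _ in range(3)]
print(slope_degrees(ramp, cell_size=780.0)[1][1])
```

A surface that rises by one cell width per cell gives a gradient of 1 and therefore a slope of 45 degrees at the centre cell, while a flat surface gives 0.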

Time series of the environmental layers temperature, salinity, eastward and northward water velocities and concentration of diatoms and flagellates were obtained from the biophysical model NORWECOM (Skogen et al., 1995; Skogen and Søiland, 1998). This model was selected over the other models available - two biophysical, POLCOMS (Holt et al., 2005) and ECOSMO (Schrum et al., 2006), and one climatology, ICES / WODC (Berx and Hughes, 2008) - because it presented a more complete spatio-temporal coverage of the area and timeline analysed, and also offered the finest spatial resolution. Even so, this model does not have complete coverage of inshore areas and in particular it lacks coverage of the sea lochs and the intricate coastline on the west of Scotland. For this reason, it was not possible at this stage of the work to produce model outputs for these areas. For the set-up, validation and the latest information on the NORWECOM simulation refer to Hjøllo et al. (2009) and Skogen and Mathisen (2009).

NORWECOM offers monthly averages for each variable for the time period covering 1970-2012 (1970-2010 for temperature and salinity) and the datasets were extracted from the Institute of Marine Research NORWECOM hindcast download webpage in May 2013 ( http://www.imr.no/~morten/wgoofe/). Several different extractions were carried out for temperature, salinity, eastward and northward water velocities and concentration of diatoms and flagellates intended for different models:

i) Near-bottom values for Quarter 3 and Quarter 4, for the period 2000-2012; this dataset was used to model 0 group aggregation areas of gadoid fish: cod, haddock, whiting, and Norway pout; saithe and ling were not modelled.
ii) Near-bottom values for Quarter 3 and Quarter 4, for the period 1970-2012; this dataset was used to model 0 group aggregation areas of benthic fish: plaice and sole. For hake and anglerfish, presences of 0 group fish, rather than aggregations of 0 group fish, were modelled;
iii) Mean-depth values for Quarter 3 and Quarter 4, for the period 1970-2012; this dataset was used to model 0 group aggregation areas of pelagic fish: herring, mackerel, horse mackerel, sprat and blue whiting;
iv) Near-bottom annual values, for the period 1970-2012; this dataset was used to model small larvae aggregation areas of herring.

The seabed sediment data were obtained from the European Marine Observation and Data Network (EMODNET) Seabed substrate map. The map was collated and harmonised from substrate information within the EMODNET-Geology project (http://www.emodnet-geology.eu/), with the contribution of more than 200 separate sea-bed substrate maps. In British waters the data were provided by the British Geological Survey (BGS) and the existing substrate classifications have been translated to a scheme that is supported by EUNIS, the European Nature Information System. This EMODNET reclassification scheme consists of four substrate classes defined on the basis of the modified Folk triangle (mud to sandy mud; sand to muddy sand; coarse sediment; mixed sediment) and two additional substrate classes (diamicton and rock). In addition, the mixed sediment class includes four subcategories: mixed sediment with bimodal grain-size distribution, glacial clay, hard bottom complex and highly patchy seafloor areas. The final version (EMODNET, 2012) was produced in June 2012 and is available online at http://geomaps2.gtk.fi/ArcGIS/services/EMODnet/MapServer/WMSServer as a polygon feature class. To convert this into a raster layer of the same resolution as the other environmental layers, the Polygon to Raster tool from the Conversion toolbox of ESRI ® ArcMap™ was used, with the MAXIMUM_COMBINED_AREA cell assignment method: if there was more than one feature in a cell with the same value of sediment type, this method combined the areas of these features, and the combined feature with the largest area within the cell determined the value to assign to that cell.
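The cell assignment rule described above can be sketched for a single raster cell in Python. This is an illustration of the logic only, not the ArcMap tool: the function name and the sediment fragments are invented, and the handling of exact ties (first value encountered) is an assumption.

```python
from collections import defaultdict


def max_combined_area_value(fragments):
    """Sediment class for one cell under a combined-area rule.

    fragments: list of (sediment_class, area) for the polygon pieces that
    fall inside the cell. Areas of pieces sharing a class are summed and
    the class with the largest combined area is returned.
    """
    combined = defaultdict(float)
    for sediment_class, area in fragments:
        combined[sediment_class] += area
    if not combined:
        return None  # no polygon intersects the cell
    return max(combined, key=combined.get)


# Invented cell: two sand polygons and one larger mud polygon.
cell = [("sand", 0.30), ("mud", 0.40), ("sand", 0.25)]
print(max_combined_area_value(cell))
```

In this invented cell the single largest polygon is mud, but the two sand polygons together cover more of the cell, so the cell is assigned to sand.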

The distance to coast layer was produced by the Euclidean Distance tool of the Spatial Analyst package of ESRI ® ArcMap™. This tool produces a raster output giving the distance from each cell in the raster to the closest source, which in this case was a combination of the layers pan50 for the coastline of Scotland (the 0 m contour from the OS PANORAMA dataset, http://www.ordnancesurvey.co.uk/business-and-government/products/land-form-panorama.html) and britisles for the remaining coastline of the British Isles.
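What a Euclidean distance layer contains can be shown with a brute-force Python sketch on a tiny grid. The function name and the example mask are hypothetical; a production tool would use a far more efficient algorithm, and this sketch measures distances between cell centres on a planar grid.

```python
import math


def euclidean_distance_raster(source_mask, cell_size):
    """Distance from every cell centre to the nearest source cell centre.

    source_mask: list of rows; truthy cells are sources (e.g. coastline).
    Returns a grid of distances in the units of cell_size.
    """
    sources = [(r, c)
               for r, row in enumerate(source_mask)
               for c, value in enumerate(row) if value]
    if not sources:
        raise ValueError("at least one source cell is required")
    return [[cell_size * min(math.hypot(r - sr, c - sc) for sr, sc in sources)
             for c in range(len(row))]
            for r, row in enumerate(source_mask)]


# Invented 2 x 3 grid with a single coastline cell in the top-left corner.
coast = [[1, 0, 0],
         [0, 0, 0]]
distance = euclidean_distance_raster(coast, cell_size=780.0)
print(distance[0][2], distance[1][1])
```

The same calculation applies to any source layer, for example cells classified as gravel instead of coastline.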

Finally, the same Euclidean Distance tool was used to calculate the layer distance to gravel, this time using the categories Gravel (GV) and Sandy Gravel (SDGV) of the BGS's Marine SeaBed Sediment Map - UK Waters - 250k (DigSBS250) as source. This layer, being especially relevant for defining herring spawning grounds (Parrish et al., 1959; Bowers, 1969; Holliday (1958) cited in Rankine, 1986), was only used to model small larvae aggregation areas of herring and left out of the other models.

Because the IHLS surveys are designed so that the hauls take place within a 10 x 10 nautical mile (NM) grid (ICES, 2008), the resolution of the environmental layers had to be adapted to accommodate this coarser resolution of the survey data for the models of herring spawning areas. The centres of these 10 x 10 NM cells are defined as the positions where the samples should be taken. Most hauls take place close to the cell centre, depending on weather conditions, wind stress, presence of oil platforms etc. There is therefore some flexibility in haul position, most often a deviation of up to one mile and seldom two (Norbert Rohlf, personal communication, 21/08/2013). A 2 x 2 NM grid was therefore created, in order to allow for a 1 NM radius around the centre of each cell, and all the environmental layers were resampled to this resolution. The resampling was done with the Resample tool of the ArcMap Data Management toolbox, using the cubic convolution algorithm as the resampling technique. This algorithm uses the values of the sixteen nearest input cell centres to determine the value on the output raster. The new value for the output cell is a weighted average of these sixteen values, adjusted to account for their distance from the centre of the output cell.
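The weighted average over sixteen neighbouring cells can be illustrated with a Python sketch of cubic convolution interpolation at a single output position. The kernel parameter a = -0.5 is a common choice and is an assumption here, as are the function names and the clamping of indices at the grid edge; the report does not state the settings used by the ArcMap tool.

```python
import math


def cubic_kernel(x, a=-0.5):
    """Cubic convolution weight for a cell centre at distance x (in cells)."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x ** 3 - (a + 3) * x ** 2 + 1
    if x < 2:
        return a * x ** 3 - 5 * a * x ** 2 + 8 * a * x - 4 * a
    return 0.0


def cubic_convolution(grid, row, col, a=-0.5):
    """Interpolate grid at a fractional (row, col) position.

    Positions are in units of input cell centres. The 4 x 4 block of
    nearest cells is weighted by the product of row and column kernels.
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = math.floor(row), math.floor(col)
    total = 0.0
    for i in range(r0 - 1, r0 + 3):
        weight_row = cubic_kernel(row - i, a)
        ri = min(max(i, 0), rows - 1)  # clamp at the edges (assumption)
        for j in range(c0 - 1, c0 + 3):
            cj = min(max(j, 0), cols - 1)
            total += grid[ri][cj] * weight_row * cubic_kernel(col - j, a)
    return total


# Invented 4 x 6 grid whose value equals the column index.
ramp = [[float(c) for c in range(6)] for _ in range(4)]
print(cubic_convolution(ramp, 1.5, 2.5))
```

The sixteen weights always sum to one, so a position that coincides with an input cell centre returns that cell's value, and a position halfway between columns 2 and 3 of the linear ramp returns 2.5.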

A summary of all the environmental layers used in the present work can be found in Table 6.

Table 6

List of environmental layers used in the species distribution models.

Variable Resolution Reference
Bathymetry 30 arc-sec ≈ 780m The GEBCO_08 Grid, version 20100927.
http://www.gebco.net
Slope 30 arc-sec ≈ 780m Derived from Bathymetry, GEBCO_08 Grid, version 20100927.
http://www.gebco.net
Temperature 0.1° ≈ 7km NORWECOM (Skogen et al., 1995; Skogen & Søiland, 1998).
http://www.imr.no/~morten/wgoofe/
Salinity 0.1° ≈ 7km NORWECOM (Skogen et al., 1995; Skogen & Søiland, 1998).
http://www.imr.no/~morten/wgoofe/
Eastward
sea water velocity
0.1° ≈ 7km NORWECOM (Skogen et al., 1995; Skogen & Søiland, 1998).
http://www.imr.no/~morten/wgoofe/
Northward
sea water velocity
0.1° ≈ 7km NORWECOM (Skogen et al., 1995; Skogen & Søiland, 1998).
http://www.imr.no/~morten/wgoofe/
Diatoms concentration 0.1° ≈ 7km NORWECOM (Skogen et al., 1995; Skogen & Søiland, 1998).
http://www.imr.no/~morten/wgoofe/
Flagellates concentration 0.1° ≈ 7km NORWECOM (Skogen et al., 1995; Skogen & Søiland, 1998).
http://www.imr.no/~morten/wgoofe/
Seabed sediments 30 arc-sec ≈ 780m EMODNET Seabed substrate map (1:1 million), EMODNET-Geology.
http://geomaps2.gtk.fi/ArcGIS/services/EMODnet/MapServer/WMSServer
Distance to gravel 30 arc-sec ≈ 780m Euclidean distance calculated from categories Gravel ( GV) and Sandy Gravel ( SDGV)
Marine SeaBed Sediment Map - UK Waters - 250k (DigSBS250). http://www.bgs.ac.uk/discoverymetadata/13605549.html
Distance to coast 30 arc-sec ≈ 780m Euclidean distance calculated from pan50 (Scotland) and britisles (rest of the coastline)

2.3 Species Distribution Models

All analyses were conducted using R 2.15 (R Development Core Team, 2009; URL: http://www.R-project.org). Two distinct modelling approaches were used, one based on presence-only data, MAXENT, and one based on presence-absence data, Random Forest.

For the presence-only approach, the MAXimum ENTropy (MAXENT) algorithm (Phillips et al., 2006) was used. This model is based on the concept of the ecological niche defined by Hutchinson (1957). It uses different mathematical algorithms to calculate the ecological niche of the target species based on the environmental variable values at the presence points (Monk et al., 2010). The MAXENT method (Phillips et al., 2006; Elith et al., 2011) minimizes the relative entropy between two probability densities (one estimated from the presence data and the other from the landscape) defined in covariate space. After defining the niche, the model projects it into geographic space to produce a predictive map of suitable habitat. In this work the MAXENT implementation in the R package 'dismo' was used.

The presence-absence based approach was conducted using the Random Forest model (Breiman, 2001). Random Forest is an advanced modification of Classification and Regression Trees (CARTs; Breiman, 1984). As the name suggests, Random Forest fits many classification trees to a dataset, and then combines the predictions from all the trees. As described in Cutler et al. (2007), the algorithm begins with the selection of many bootstrap samples from the data. A classification tree is then fitted to each bootstrap sample, but at each node only a small number of randomly selected variables are available for the binary partitioning. The trees are fully grown and each is used to predict the out-of-bag observations, which are those that are present in the original dataset but do not occur in that tree's bootstrap sample. The predicted class of an observation is calculated by majority vote of the out-of-bag predictions for that observation, with ties split randomly.

To assess the importance of a specific predictor variable, the values of that variable are randomly permuted for the out-of-bag observations, and then the modified data are passed down the tree to get new predictions. The difference between the misclassification rate for the modified and original out-of-bag data, divided by the standard error, is a measure of the importance of the variable.

The presence-absence data for each species were randomly divided into a training subsample (with 90% of the total points) and a test subsample (with the remaining 10%), following the methodology described by Hijmans and Elith (2013). The ability of the model fitted to the training subsample to predict the probability of presence was tested using the test subsample. The models were also tested on a year-by-year basis: for each year, the data from that year were used as the test subsample and the data from all the other years as the training subsample.
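The two evaluation designs can be sketched in Python as follows (the study itself used R). The function names, the record layout and the fixed random seed are hypothetical and serve only to make the example reproducible.

```python
import random


def random_split(records, test_fraction=0.10, seed=0):
    """Randomly split records into (train, test) with about 10% held out."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def leave_one_year_out(records):
    """Yield (year, train, test), holding out each year in turn."""
    for year in sorted({record["year"] for record in records}):
        train = [record for record in records if record["year"] != year]
        test = [record for record in records if record["year"] == year]
        yield year, train, test


# Invented presence-absence records: 20 hauls in each of five years.
records = [{"year": 2000 + i % 5, "presence": i % 2} for i in range(100)]

train, test = random_split(records)
print(len(train), len(test))

for year, year_train, year_test in leave_one_year_out(records):
    print(year, len(year_train), len(year_test))
```

Repeating the random split with different seeds gives the repeated evaluations, while the year-by-year design produces exactly one evaluation per survey year.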

The performance of the models was estimated using two different statistics: the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC; Fielding and Bell, 1997) and the kappa statistic (Cohen, 1960). The AUC varies between 0 and 1, with values higher than 0.9 considered as excellent performance, whereas values between 0.7 and 0.9 indicate good prediction and values lower than 0.7 indicate poor prediction (Hosmer and Lemeshow, 2000). The kappa statistic ranges from -1 to 1, with values higher than 0.75 indicating excellent prediction, values between 0.4 and 0.75 indicating good prediction and values lower than 0.4 indicating poor prediction (Landis and Koch, 1977). This evaluation process was repeated 10 times for each combination of species and model (and once per year in the case of the year-by-year evaluation), calculating the AUC and kappa values each time based on a different random selection of training and test subsamples. Both statistics were calculated using the implementation of evaluate in the R package 'dismo'. The threshold used to compute the kappa value was calculated each time, using the implementation of threshold in the same package. The threshold which provided maximum kappa values was selected and used as the probability of presence above which to identify the most sensitive areas.
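For readers unfamiliar with these statistics, the Python sketch below computes the AUC, Cohen's kappa at a given threshold, and the threshold that maximises kappa from predicted probabilities at test presences and absences. It is not the 'dismo' implementation: the function names and scores are invented, and treating a score at or above the threshold as a predicted presence is an assumption.

```python
def auc(presence_scores, absence_scores):
    """Probability that a presence outscores an absence (ties count half)."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


def kappa(presence_scores, absence_scores, threshold):
    """Cohen's kappa when scores >= threshold are predicted presences."""
    tp = sum(s >= threshold for s in presence_scores)
    fn = len(presence_scores) - tp
    fp = sum(s >= threshold for s in absence_scores)
    tn = len(absence_scores) - fp
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        return 0.0  # all cases in one class; kappa is undefined
    return (observed - expected) / (1.0 - expected)


def max_kappa_threshold(presence_scores, absence_scores):
    """Candidate threshold (taken from the observed scores) with highest kappa."""
    candidates = sorted(set(presence_scores) | set(absence_scores))
    return max(candidates,
               key=lambda t: kappa(presence_scores, absence_scores, t))


# Invented predicted probabilities for test presences and absences.
presences = [0.9, 0.8, 0.7, 0.65, 0.4]
absences = [0.6, 0.3, 0.2, 0.1]

best = max_kappa_threshold(presences, absences)
print(auc(presences, absences), best, kappa(presences, absences, best))
```

With these invented scores the AUC is 0.95, and kappa is highest (about 0.78) at a threshold of 0.65, which would then be the probability above which cells are mapped as most sensitive.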

For a more technical description of how the model variables were used to predict the probability outputs shown in this report please refer to the Technical Annex at the end of this report.
