Now that you know what the "STATs19" database is for, please consider a few questions about the police-collected data (the definitive dataset):
1. What is the main purpose for the police activity that leads them to fill in a long form with information about a collision?
You may assume that the primary purpose is to inform the injury prevention effort. However, this is not the main purpose. It is a useful by-product of police attendance at a collision that we get all this information, but we do need to remember that it is just a by-product. There is actually an "On The Spot" study, carried out around Loughborough and involving the Transport Research Laboratory, in which collision investigation teams visit scenes to collect detailed information for prevention purposes.
The main purpose of police involvement is to determine whether an offence has been committed. There is also a considerable amount of "first aid" work: warning other road users of a crash site so that there aren't further collisions, keeping the scene safe for any other emergency services, dealing with distraught people, and so on.
2. Why should we be cautious when interpreting the data from such police-collected forms?
Many, if not all, "official" statistics are the by-product of a process. The involvement of police at the scene of a road collision (or the involvement of police in a station if a collision is reported there) is not primarily geared towards injury prevention.
If a person works with a prosecution mentality, all conclusions have to be drawn "beyond reasonable doubt". It is hard, without proper accident investigation procedures, to determine the speed of vehicles prior to a crash. So it is entirely reasonable to expect the forms to be filled in with some caution. Important causal factors, such as excess speed, may not be entered. This is not because excess speed wasn't a factor, but because, in a collision involving minor injuries only, it couldn't be determined beyond reasonable doubt that excess speed was a factor. So when interpreting these data, we need to be mindful of the filters that were applied when the forms were completed.
3. What are the two conditions required in order for an event to appear in the database?
(a) the collision participants need to tell the police
(b) the police need to agree that an injury has resulted from a collision involving a vehicle on a public highway.
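Because both conditions must hold, the two filters multiply. The sketch below is only an illustration of that arithmetic: the function name is made up for this example, and the rates passed to it are invented placeholders, not estimates taken from STATs19 or any other source.

```python
def expected_recorded(true_count, p_reported, p_recorded_given_reported):
    """Expected number of database records after both filters are applied.

    true_count: hypothetical number of injury collisions that really occurred
    p_reported: assumed chance that participants tell the police, condition (a)
    p_recorded_given_reported: assumed chance that the police then record
        an injury collision, condition (b)
    """
    for p in (p_reported, p_recorded_given_reported):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    # An event appears in the database only if it passes BOTH filters,
    # so the two probabilities are multiplied together.
    return true_count * p_reported * p_recorded_given_reported


# Invented rates, purely for illustration.
print(expected_recorded(1000, 0.75, 0.5))
```

Even when each filter on its own looks fairly mild, the product can be noticeably smaller than either rate, which is one reason to expect under-counting in the recorded totals.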
4. A number of possible reasons are listed below which might either result in someone not informing the police, or the police not recording an injury as a result of a road collision. Which is NOT a reason that prevents a collision being recorded, and is in fact a reason for a non-injury collision to be recorded?
a) An uninsured driver would prefer not to inform the police of a collision.
Estimates vary, but there are suggestions that 1 in 10 motorists are not insured. If you could avoid it, you'd probably prefer not to have a long discussion with a police officer about a road collision if you were uninsured.
b) Someone does not realise that crashes involving bicycles have to be reported. All collisions involving a vehicle, motorised or not, which result in injury should be reported.
The "rules" regarding the way data should be reported and exactly how it should be recorded are rather complex. As road users, we are rarely involved in a collision. Many police officers are only required to attend a small number of road collisions every year. It's really asking for a train-spotter / stamp collector mentality (who said statistician mentality?) to understand all these rules in all their fine detail. So it may be that, with the best of intentions, not every detail of every eligible collision gets recorded entirely accurately. If you cut yourself having come off a bicycle when hitting the kerb, are you sure you'd expect it to be reported and recorded?
c) It does not appear that anyone is injured - and few of us realise the official definition of Injury.
A very long-standing criticism of the STATs19 data is that "injury" is defined in a very strange way, and is not assessed by a medically trained person. So, with the best of intentions, the precise severity of a collision may not be recorded accurately. This could result in a "slight" injury collision not being recorded at all, or a "serious" injury collision being recorded as a "slight" injury.
d) A driver in a very minor collision in which no-one else is injured decides they can "invent" an injury in order to claim compensation.
This is the one example where a collision might be recorded when it should not be.
It is generally thought that the STATs19 (police collected) data are pretty much entirely correct as far as records of fatal crashes go. It is thought the accuracy of the records is lower for serious injury crashes, and lower still for slight injury collisions.
A rather famous analysis reported in the British Medical Journal caused quite a stir when it was published. Very careful follow-up of this report (by organisations such as the Statistics Commission, which quality-checks National Statistics) has suggested that perhaps the police are tending to report injuries as slight where before they would have recorded them as serious, and that the apparent fall in serious crashes is therefore not as encouraging as it appears.
Obviously, we would wish to have one perfect and definitive dataset. But in the world of official statistics, this can never happen. As mentioned at the start of this section, that is because Official Statistics are the by-product of some administrative process. Whilst the data are not perfect, we can be reasonably sure that there is no large-scale fraud or manipulation. So we have to interpret the data with some caution, remembering that the police focus is on prosecutions and that minor injuries involving non-motorised vehicles are the most likely to be undercounted. But otherwise, we have a definitive dataset that has been collected in good faith and that, with care, can assist us in determining what is happening on the highway and provide a starting point for deciding what we should do about it.