Introduction
The Department for Transport’s Transparency Returns represent a significant step towards increasing visibility and accountability in local highway asset management across England. They were introduced in a context of strong political commitment to transparency, including ministerial promises of comparative ratings. Authorities were asked to publish data locally, with a proforma provided for guidance, but with freedom to present the data in whatever format they chose, provided it was publicly available by a specified date.
Only later did it become clear that the data would be used to construct national ratings for condition, spend and best practice. This sequencing – collecting data before its analytical use was fully specified – is understandable from a policy perspective, but it has important consequences for the quality, comparability and defensibility of the resulting evidence.
The experience of the National Highways and Transport (NHT) Network in manually collecting data from all the authority websites provides a particularly clear illustration of these consequences. The effort required to reconstruct a single usable national dataset from many differently structured local publications is itself a signal that the current collection approach is not well aligned with national analytical use.
This paper draws on a system-level analysis of the Annex A fields actually used by DfT in its published ratings. It does not assess highway performance. Instead, it asks whether the data submitted is sufficiently complete, coherent and comparable to support the use DfT has made of it, and what changes would most effectively strengthen the system going forward.
Scope and approach
The analysis focuses exclusively on those fields that DfT has used in its ratings: red and amber condition for A and BC roads and red condition for U roads; planned capital maintenance spend and DfT capital allocation; and the preventative maintenance share. The aim has been to understand how well the data supports DfT’s published methodology, rather than to form any judgement on authorities themselves.
The work examines completeness and missingness, logical coherence within the data, adherence to expected bounds and scales, patterns of year-on-year change, and recurring data quality signals such as rounding, carry-forward and co-missing fields. The emphasis throughout has been on system-level behaviour rather than individual authority outliers.
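To illustrate the kind of checks involved, the sketch below shows how two of these signals, carry-forward and co-missing fields, might be detected in a combined national table. It is a minimal illustration in Python, assuming a simple long-format table with one row per authority per year; the column names are invented for the example and do not correspond to the actual Annex A field names.

    import pandas as pd

    # Illustrative only: one row per authority per year.
    # Column names are assumptions, not actual Annex A field names.
    returns = pd.DataFrame({
        "authority":     ["A", "A", "A", "B", "B", "B"],
        "year":          [2023, 2024, 2025, 2023, 2024, 2025],
        "red_a_roads":   [4.0, 4.0, 4.0, 6.0, None, 5.0],
        "amber_a_roads": [20.0, 20.0, 20.0, 25.0, None, 24.0],
    }).sort_values(["authority", "year"])

    cols = ["red_a_roads", "amber_a_roads"]

    # Carry-forward signal: every condition field identical to the
    # previous year's value for the same authority.
    previous = returns.groupby("authority")[cols].shift()
    returns["carried_forward"] = previous.eq(returns[cols]).all(axis=1)

    # Co-missing signal: condition fields absent together, suggesting a
    # structural gap rather than an isolated omission.
    returns["co_missing"] = returns[cols].isna().all(axis=1)

    print(returns)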
What the analysis shows
Overall, the data is not “bad”. Most values are structurally valid and many fields are reported consistently across most authorities. However, several persistent patterns indicate weaknesses in the design of the collection system rather than in authority behaviour.
A first issue is structural incompleteness. Some of the fields that DfT relies on for its ratings are routinely missing for a significant minority of authorities. Around one in ten authorities are missing at least one key condition value in the most recent year, rising to nearly one in five in earlier years. This pattern is stable and widespread, suggesting that the system makes it too easy to submit incomplete data for metrics that are central to the ratings.
A second issue concerns internal coherence. There are cases where reported proportions of red and amber condition add up to more than the total possible share. These are mechanical contradictions rather than differences of interpretation or performance, and their presence indicates that the collection system currently allows logically invalid data to enter the evidence base used for published ratings.
There is also evidence of ambiguity in scale and interpretation. While most condition values sit within plausible numeric ranges, a small but persistent proportion appear to reflect inconsistent use of scale, for example proportions being entered where percentages are expected, or vice versa. More importantly, even where the numbers look similar, it is not always clear that all authorities are measuring the same thing, for example because they use different road condition survey methods.
The spend data presents a different but equally significant comparability issue. There are many instances where planned capital maintenance spend differs from DfT capital allocation by orders of magnitude. While some variation is to be expected, differences of this scale are unlikely to be explained purely by investment choices or network size. They more plausibly reflect inconsistent definitions of what counts as capital maintenance, mismatched financial years, or inconsistent units.
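A simple way to screen for such discrepancies is to compare the two figures on a logarithmic scale, so that ordinary variation in investment is tolerated but order-of-magnitude gaps are flagged for review. The sketch below assumes two hypothetical fields, planned_capital_spend and dft_capital_allocation, both in pounds for the same financial year; the one-order-of-magnitude threshold is an illustrative choice, not a DfT rule.

    import math

    def spend_allocation_flag(planned_spend, dft_allocation, max_orders=1.0):
        """Flag spend-to-allocation ratios that differ by more than
        max_orders orders of magnitude in either direction.

        Both inputs are assumed to be positive amounts in the same
        units (e.g. pounds for the same financial year)."""
        if planned_spend is None or dft_allocation is None:
            return "missing"
        if planned_spend <= 0 or dft_allocation <= 0:
            return "non-positive value"
        gap = abs(math.log10(planned_spend / dft_allocation))
        return "order-of-magnitude gap" if gap > max_orders else "ok"

    # A spend figure entered in thousands against an allocation entered
    # in pounds shows up immediately.
    print(spend_allocation_flag(12_500, 11_800_000))      # order-of-magnitude gap
    print(spend_allocation_flag(14_200_000, 11_800_000))  # ok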
Finally, several metrics show patterns of being carried forward unchanged across years, and there are instances of large year-on-year changes occurring across many authorities at once. These patterns are more consistent with reporting or interpretation changes than with real-world change, and point to the absence of clearly defined questions and guidance that would allow such changes to be understood and managed.
The underlying cause: how the system was designed
These issues are best understood not as failures of authorities, but as predictable consequences of how the Transparency Returns were set up.
Authorities were required to publish data locally, but allowed to do so in any format they wished. As a result, some used the DfT proforma, some adapted it, others created bespoke tables, and others embedded data in PDFs or reports. There was no standard submission mechanism and no opportunity for validation or consistency checking at the point of entry.
The approach taken by DfT prioritised local publication and flexibility, which made sense in a transparency-first context. However, it also meant that national aggregation, validation and quality control could only happen after the fact, placing a substantial burden on DfT and making it difficult to ensure that the data is truly comparable and defensible for national ratings.
In effect, the system optimised for local presentation rather than for national evidence production. The data quality issues observed flow naturally from that choice.
What the data is realistically suited to
Given how they have been collected, the Transparency Returns are currently best suited to understanding system-wide patterns, trends and pressures, rather than to rating or ranking individual authorities. Collectively, they are valuable mainly for identifying where condition or investment is under strain nationally, for understanding the scale and variability of practice, and for supporting policy development and modelling.
The system is less well suited, in current form, to treating small differences between individual authorities as robust performance signals. This is not a criticism of the data, but a recognition of the relationship between collection design and analytical use.
How the system could be strengthened
Most of the weaknesses identified can be addressed without increasing reporting burden, simply by changing how the data is collected rather than what is collected.
Relatively modest validation improvements would remove a large class of errors, for example preventing submission where red and amber condition exceed the total possible share, enforcing non-negativity and bounds, and introducing soft warnings for extreme spend-to-allocation ratios. Making fields that are directly used in DfT’s ratings mandatory, or requiring explicit justification when they are missing, would significantly improve completeness.
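A minimal sketch of what such point-of-entry checks might look like is given below. The field names and thresholds are illustrative assumptions rather than DfT's actual rules; the intention is simply to show how hard failures (which would block submission) can be separated from soft warnings (which would prompt review).

    def validate_return(entry):
        """Illustrative point-of-entry checks for a single authority return.

        `entry` is assumed to be a dict with percentage condition fields
        (0-100) and spend fields in pounds. Returns lists of hard errors
        and soft warnings."""
        errors, warnings = [], []

        # Mandatory fields used directly in the ratings.
        required = ["red_a_roads", "amber_a_roads", "red_u_roads",
                    "planned_capital_spend", "dft_capital_allocation"]
        for field in required:
            if entry.get(field) is None:
                errors.append(f"{field} is missing and no justification given")

        # Bounds and non-negativity for percentage fields.
        for field in ["red_a_roads", "amber_a_roads", "red_u_roads"]:
            value = entry.get(field)
            if value is not None and not 0 <= value <= 100:
                errors.append(f"{field} must be between 0 and 100")

        # Logical coherence: red and amber cannot exceed the whole network.
        red, amber = entry.get("red_a_roads"), entry.get("amber_a_roads")
        if red is not None and amber is not None and red + amber > 100:
            errors.append("red plus amber A-road condition exceeds 100%")

        # Soft warning for extreme spend-to-allocation ratios.
        spend = entry.get("planned_capital_spend")
        alloc = entry.get("dft_capital_allocation")
        if spend and alloc and not 0.1 <= spend / alloc <= 10:
            warnings.append("spend differs from allocation by more than 10x; please confirm")

        return errors, warnings

Checks of this kind are cheap to run at the point of entry and, crucially, surface problems while the authority is still able to correct them.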
The guidance could be made more operational by explicitly specifying denominators, asset scope and financial alignment, and by including worked examples for metrics that are difficult to derive consistently, such as preventative maintenance share. Explicitly stating how each field is used by DfT would help authorities focus their effort where it matters most.
Form design improvements, such as enforcing units and scale in the input itself, removing silent carry-forward without review prompts, and structuring the form to reduce partial completion, would further improve consistency without adding complexity for authorities.
Finally, relatively light-touch governance improvements, such as capturing whether values are measured or estimated, recording sign-off responsibility, and versioning the form and guidance, would make it much easier to interpret changes over time and maintain confidence in the system.
A particular opportunity: a sector standard input form
A particularly significant opportunity arises from the fact that NHT has already developed an Annex A input form, which some authorities have used successfully to publish their returns, and which feeds the input data directly into an SQL database. The form builds on NHT's experience of setting up and running the largest performance benchmarking network in the sector.
This is not a theoretical solution but a working one, already aligned with authority practice and sector norms.
Rolling this out nationally would transform the collection system with minimal additional development. It would enable validation at the right point, enforce structure and units, and dramatically reduce the burden of national aggregation, while remaining sector-led rather than centrally imposed.
Conclusion
The Transparency Returns have made an important contribution to transparency in local highway asset management. However, the current collection approach prioritises publication over evidence quality. The data quality issues observed are not primarily the result of authority behaviour, but of a decentralised, unvalidated collection model.
By modestly redesigning the collection system – particularly through standardisation and validation – DfT can substantially improve the quality, usability and credibility of the data, while reducing burden on both authorities and data users. NHT is well placed to support this transition, building on its existing tools, relationships and expertise.



