What makes good data?

by John Millar | August 12, 2022 | 5 min read


It is no secret that, in many industries – and certainly in the construction industry – data is the new oil.

Recent years have seen our industry move away from conventional paper-based documentation – the prevalent but deeply flawed currency of valuable information for many decades – towards digitalised approaches, which trade the inherent limitations and troublesome nature of the former for the versatility and reliability of the latter.

Data – aggregations of raw facts which serve as the basis of information when given context – comes in many shapes and sizes, but regardless of a dataset’s specific form, purpose, status or content, there are certain universal characteristics which determine its overall quality and suitability for use.

As part of any holistic, outcomes-focused approach, those involved in the production and management of data should strive to understand and maintain awareness of these characteristics so that their data is fit for purpose and adds value in the manner intended. Here we look at some of these characteristics in more detail.

Accuracy and Precision

Accuracy is perhaps the most fundamental measure of data quality: a high degree of accuracy is necessary for the data to be fully representative of its subject matter, whether that be delivery-phase data that communicates the design intent or operational data that reflects the live performance of the built asset and its constituent systems.

This accuracy is critical to the utilisation of the data, and its importance is neatly summarised in the well-known GIGO principle: Garbage In, Garbage Out! Valuable outputs depend on reliable inputs, and no amount of structure or process in between can compensate for unreliable inputs.

Precision is distinct from accuracy in that it measures not the overall conformance to the truth, but the degree of exactness in that conformance. Take an architectural example: have the building’s dimensions or gross internal floor area values been provided as integers or to two decimal places? Standards for precision can be prescribed or subjective depending on the needs of the project, but either way they should be consistently upheld.
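
To make the idea concrete, here is a minimal sketch of an automated precision check. It assumes a hypothetical rule that gross internal floor area (GIFA) values must be stated to exactly two decimal places; the level names and values are illustrative only.

```python
from decimal import Decimal

# Hypothetical project rule: GIFA values must be stated to exactly two
# decimal places. Values are kept as strings because a float cannot
# preserve the precision as originally written.
REQUIRED_DECIMALS = 2

def has_required_precision(value: str, places: int = REQUIRED_DECIMALS) -> bool:
    """Return True if the value is stated to exactly `places` decimal places."""
    return Decimal(value).as_tuple().exponent == -places

areas = {"Level 00": "412.50", "Level 01": "398.7", "Level 02": "401"}
for level, gifa in areas.items():
    if not has_required_precision(gifa):
        print(f"{level}: GIFA '{gifa}' does not meet the two-decimal-place rule")
```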

Completeness and Timeliness

Another fundamental measurement of data quality is completeness; any given dataset at the point of delivery should contain all of the required data.

This is accounted for in the ISO 19650 methodology, where the project’s Asset Information Requirements (AIR) precisely define the required content of the built asset’s structured data, while the project’s gateways or decision points (e.g. the stages of the RIBA Plan of Work 2020) provide a temporal framework for how this data should develop over the course of the delivery phase. These gateways also provide natural points at which periodic audits can be scheduled to ensure sustained development of, and commitment to, the relevant information requirements.

A deep understanding of these requirements is necessary to ensure that the dataset delivered will be fit for purpose. Missing data points can create exactly the kind of issue that the provision of structured data seeks to solve: unavailability (e.g. a missing model number, warranty or spare-parts data for a faulty boiler) can force manual investigative processes into the job order. These carry implications for the total time, cost and carbon invested, particularly where the fault in question is likely to lead to significant disruption and operational downtime.
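
A simple automated check can catch such gaps before handover. The sketch below validates a maintainable asset record against a set of required fields; the field names are illustrative and not drawn from any particular AIR.

```python
# Required fields for a maintainable asset record, as might be drawn
# from the project's AIR. These names are illustrative only.
REQUIRED_FIELDS = {"asset_name", "model_number", "warranty_expiry", "spare_parts_supplier"}

def missing_fields(record: dict) -> set:
    """Return the required fields that are absent or empty in the record."""
    return {field for field in REQUIRED_FIELDS if not record.get(field)}

boiler = {
    "asset_name": "Boiler B-01",
    "model_number": "HX-400",
    "warranty_expiry": "",  # empty value: fails the completeness check
    # "spare_parts_supplier" is missing entirely
}

gaps = missing_fields(boiler)
if gaps:
    print(f"Incomplete record: missing {sorted(gaps)}")
```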

As always, the right data is needed at the right time if the benefits of digitalised methodologies are to be fully realised, so careful consideration of information requirements is key to a successful outcome.

Relevance and Consistency

Although this may seem an obvious consideration, it is important to ensure that the data provided is relevant to the needs of those who will be using it.

Relevance is subjective, and the only way to guarantee this quality in the data is through close and sustained engagement with the end user(s), which should include comprehensive discussion of both existing and potential practices.

The production and management of redundant data is wasteful for all involved, and careful elicitation of information requirements will ensure that only what is useful is produced. Everything that is produced should be consistent with the formats adopted elsewhere, for example in the other datasets the end user relies on for asset management at the portfolio level.
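
As a brief illustration, the sketch below checks that date values follow a single agreed convention. The assumption that the portfolio standard is ISO 8601 (YYYY-MM-DD), and the sample values themselves, are hypothetical.

```python
import re

# Assumed portfolio-level convention: ISO 8601 dates (YYYY-MM-DD).
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Hypothetical installation dates drawn from several delivered datasets.
installation_dates = ["2022-03-14", "14/03/2022", "2022-03-15"]

inconsistent = [d for d in installation_dates if not ISO_DATE.match(d)]
if inconsistent:
    print(f"Dates not in the agreed ISO 8601 format: {inconsistent}")
```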

Versatility and Interoperability

One key benefit of structured data is its flexibility and versatility. When freed from the restrictions of an unreliable, unsearchable and easily degradable paper-based format – or a collection of unorganised PDFs on a pen drive, which isn’t much better! – data can flow from one application to another with no loss or degradation and no need for significant human intervention.

Open data formats such as Industry Foundation Classes (IFC) and Construction Operations Building information exchange (COBie) are widely recognised across the ecosystem of digital tools that can be implemented over the entire lifecycle of the built asset, and this enables an impressively wide array of potential use cases. This versatility adds considerable value for any organisation looking to fully embrace the digital economy, and so the interoperability of the data provided will always be of great importance.
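
As a small illustration of that flow, the sketch below reads wall data from an IFC model using the open-source ifcopenshell library; any IFC-capable toolkit would serve equally well, and the file name is hypothetical.

```python
import ifcopenshell

# Open an IFC model exported from one authoring tool; the same file can
# be read by any other IFC-aware application without loss of structure.
model = ifcopenshell.open("built_asset.ifc")

# Query the walls and print their identifiers: no re-keying of data,
# no manual transcription.
for wall in model.by_type("IfcWall"):
    print(wall.GlobalId, wall.Name)
```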

By considering these universal characteristics of data quality, all involved in the production and management of data can ensure that the datasets they provide are reliable, fit for purpose and add maximum value for the projects and clients they serve.

To learn more about good data, why we need it and where to find it, contact John Millar directly at [email protected].