The dataset
A VAT number is a unique nine-digit identifier that is assigned to a UK business that is registered for value-added tax (VAT).
These numbers often appear on a business receipt or invoice, which helps the UK tax authority, HMRC, to tie that transaction back to the registered company, organization or person providing the service or selling the goods.
The source
We collect GB VAT numbers directly from the primary source, HMRC.
The website portal can be accessed here: VAT – GOV.UK
How does this relate to legal entities?
In the UK, VAT numbers are assigned to ‘businesses’, not companies, so there is no strict one-to-one mapping between VAT numbers and companies.
- A legal entity can have multiple VAT numbers for the different businesses it operates. It can also use its parent’s VAT number.
- A VAT number can be transferred when the business is sold.
- VAT numbers are also issued to non-companies, including sole traders. This becomes more of a problem when working with HMRC’s VAT validation tools, as they only provide a company name and address. For this reason we have chosen to resolve only to registered companies that have a strict name match, after basic normalisation over case, punctuation and stemming. For example, Foo Bar Ltd will match Foo Bar Limited.
This currently limits our coverage where a company’s VAT registration details differ from the details it has given to Companies House. We are looking to improve on this in the future.
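As a rough illustration of the kind of name matching described above, the sketch below lower-cases a name, strips punctuation and maps a common suffix abbreviation to one canonical form. It is not the production matcher: the function names and the `SUFFIX_CANONICAL` table are hypothetical, and the suffix table is only a simple stand-in for the stemming step.

```python
import re

# Hypothetical lookup table: a stand-in for the "stemming" step.
# Each abbreviation is mapped to a single canonical spelling.
SUFFIX_CANONICAL = {
    "ltd": "limited",
}


def normalise_name(name: str) -> str:
    """Lower-case, replace punctuation with spaces, and canonicalise suffixes."""
    lowered = name.lower()
    # Anything that is not a word character or whitespace becomes a space.
    no_punctuation = re.sub(r"[^\w\s]", " ", lowered)
    tokens = no_punctuation.split()
    return " ".join(SUFFIX_CANONICAL.get(token, token) for token in tokens)


def names_match(left: str, right: str) -> bool:
    """Strict match: the two names must be identical after normalisation."""
    return normalise_name(left) == normalise_name(right)


print(names_match("Foo Bar Ltd", "Foo Bar Limited"))
print(names_match("Foo Bar Ltd", "Foo Baz Ltd"))
```

Because the comparison is an exact equality on the normalised strings, any difference beyond case, punctuation and the listed suffixes means no match, which is the coverage limitation noted above.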
VAT numbers are standard in use cases ranging from accounting to KYB (know-your-business). Understanding external identifiers such as these is important to us because it improves our ability to resolve your entities to our dataset.
VAT numbers are surprisingly opaque: currently you can only search HMRC if you already have the VAT number itself, and there is no way to search the HMRC database using the business name to find its VAT number.
Latency
We are constantly monitoring for newly issued VAT numbers. Due to the restrictive nature of the source, there can be a significant delay. It is our mission to keep improving until this data is as near real-time as possible.
To optimize how we collect this data, we had to understand the broader dataset thoroughly, to the point where we built our own list of ‘potentially’ valid VAT numbers by applying the 97-55 check algorithm (read more here).
This reduced the number of checks we have to conduct from roughly 900 million to roughly 18 million.
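A minimal sketch of such a check follows. It assumes that "97-55" refers to the commonly described modulus-97 check on GB VAT numbers together with its "9755" variant: the first seven digits are weighted 8 down to 2 and summed, the last two digits are added, and the total (or the total plus 55) must be divisible by 97. The function names are hypothetical and this is not necessarily the exact procedure we use.

```python
import re
from typing import List

# Weights applied to the first seven digits (assumed standard scheme).
WEIGHTS = (8, 7, 6, 5, 4, 3, 2)


def is_potentially_valid(vat: str) -> bool:
    """Return True if a nine-digit string passes either check-digit rule."""
    if re.fullmatch(r"[0-9]{9}", vat) is None:
        return False
    weighted = sum(int(digit) * weight for digit, weight in zip(vat[:7], WEIGHTS))
    total = weighted + int(vat[7:])  # add the two check digits
    # Original modulus-97 rule, or the "9755" variant that adds 55 first.
    return total % 97 == 0 or (total + 55) % 97 == 0


def candidate_numbers(prefix: str) -> List[str]:
    """All nine-digit numbers with this seven-digit prefix that pass a check."""
    candidates = (f"{prefix}{check:02d}" for check in range(100))
    return [number for number in candidates if is_potentially_valid(number)]


print(is_potentially_valid("164331133"))
print(candidate_numbers("1643311"))
```

Under these assumptions each seven-digit prefix admits only a handful of the 100 possible check-digit pairs, which is why a pre-computed candidate list is so much smaller than the full nine-digit range. Passing the check only means a number is well formed, not that it has been issued.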
Mapping decisions
The VAT numbers are reconciled against the company records by performing a strict 1:1 company name match.
The VAT number is mapped as a UID within the additional identifier object array as seen below.
"identifiers": [
  {
    "identifier": {
      "uid": "164331133",
      "identifier_system_code": "gb_vat",
      "identifier_system_name": "GB VAT Number"
    }
  }
]
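For readers consuming this structure programmatically, the sketch below pulls the VAT numbers out of a company record. It assumes the "identifiers" array shown above sits as a key on the company object (the enclosing braces are added here for illustration), and the helper name `vat_numbers` is hypothetical.

```python
import json

# The fragment shown above, wrapped in braces so that it parses as an object.
record = json.loads("""
{
  "identifiers": [
    {
      "identifier": {
        "uid": "164331133",
        "identifier_system_code": "gb_vat",
        "identifier_system_name": "GB VAT Number"
      }
    }
  ]
}
""")


def vat_numbers(company: dict) -> list:
    """Collect the uid of every identifier whose system code is gb_vat."""
    return [
        entry["identifier"]["uid"]
        for entry in company.get("identifiers", [])
        if entry.get("identifier", {}).get("identifier_system_code") == "gb_vat"
    ]


print(vat_numbers(record))
```

Returning a list rather than a single value reflects the point made earlier that a legal entity can hold more than one VAT number.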
Transformation decisions
No transformations to the VAT numbers have been made by OpenCorporates.
Accessing the data
You can explore the data in three main ways:
Website – Displayed in the company data records on our website – you’ll just need to log in (for free) first.
API – If you receive our data via our API, VAT numbers are now displayed as an attribute in company records. You can also search for a VAT number by using identifier_uids as a search parameter.
Bulk data – If you receive our data at scale via Bulk downloads, the Alternative Identifiers file you receive will include VAT numbers from now on.