Data Blog by Lizeo
The level of data accuracy has a significant impact on the results of an analysis. For example, the average price of a tire of a given size can differ by several percentage points depending on whether or not it carries a manufacturer’s marking (e.g. the AO marking for Audi). If the two cannot be distinguished, the analysis of the average price for that size will be distorted.
Price data is inherently short-lived, so its freshness is crucial. In e-commerce, re-pricing is very frequent: often daily in the tire industry, and sometimes several times a day in other industries.
The collection frequency is therefore of utmost importance to obtain the most accurate analysis possible: an average price per week will be more accurate if prices are collected daily.
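To make this concrete, here is a minimal sketch in Python. The prices are invented for illustration: a hypothetical tire sells at 100 for most of the week, with a three-day promotion at 90. The function `weekly_average` is an illustrative helper, not part of any real tool.

```python
from statistics import mean

# Hypothetical daily prices for one tire, Monday to Sunday.
# A promotion runs from Wednesday to Friday.
daily_prices = [100.0, 100.0, 90.0, 90.0, 90.0, 100.0, 100.0]


def weekly_average(prices, every_n_days=1):
    """Average only the prices seen when collecting every `every_n_days` days."""
    observed = prices[::every_n_days]
    return mean(observed)


daily_avg = weekly_average(daily_prices, every_n_days=1)
single_snapshot = weekly_average(daily_prices, every_n_days=7)

print(f"Average from daily collection:  {daily_avg:.2f}")        # 95.71
print(f"Average from a single snapshot: {single_snapshot:.2f}")  # 100.00
```

With a single weekly snapshot, the promotion is simply invisible, and the weekly average is overstated by more than four points.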
The collection of online consumer price data requires the use of tools which “visit” targeted commercial websites to collect the information needed, without disrupting them (ethical collection).
These tools must be smart and efficient since an e-commerce website is bound to change regularly: new design, new layout, new URL, etc.
These changes have a direct impact on data collection tools: website change = loss of data collection.
In order to keep these tools up to date, you need a team that monitors each website on a daily basis to ensure the operational capability of the system, 24 hours a day, 7 days a week.
This is where data specialist expertise comes into play.
The description of a tire sold online is one of the elements that can vary significantly from one e-commerce website to another.
The attributes that make up the description of a tire, and that can vary, are as follows:
To attach the collected prices to the right tire, it is necessary to “understand” and decipher the key elements displayed on the page that identify the product, and then compare them with a reference base of tires that exist or have existed and that have been validated against an official source. This is the data unification step.
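As an illustration of what “deciphering” a description can look like, the sketch below extracts a few attributes from a free-text product title with a regular expression. It is a simplified assumption, not the method actually used: the pattern covers only one common size notation, the set of markings contains only the AO example mentioned earlier, and `parse_description` and the product title are hypothetical.

```python
import re
from typing import Optional

# Illustrative pattern for one common passenger-tire notation, e.g. "205/55 R16 91V":
# width / aspect ratio, rim diameter, load index, speed rating.
SIZE_PATTERN = re.compile(
    r"(?P<width>\d{3})/(?P<ratio>\d{2})\s*Z?R\s*(?P<rim>\d{2})\s+"
    r"(?P<load>\d{2,3})(?P<speed>[A-Z])\b"
)

# Deliberately incomplete: only the marking cited in the article.
OE_MARKINGS = {"AO"}


def parse_description(text: str) -> Optional[dict]:
    """Return the attributes found in a product title, or None if no size is found."""
    upper = text.upper()
    match = SIZE_PATTERN.search(upper)
    if match is None:
        return None
    attrs = match.groupdict()
    # Look for a manufacturer's marking as a standalone token.
    tokens = set(re.split(r"[\s,()]+", upper))
    attrs["oe_marking"] = next((m for m in sorted(OE_MARKINGS) if m in tokens), None)
    return attrs


print(parse_description("ExampleBrand Touring 205/55 R16 91V AO"))
print(parse_description("ExampleBrand Touring 205/55R16 91V"))
```

In practice, each website writes these elements differently, which is why a single pattern like this one is never enough on its own.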
This reference database is the cornerstone of an efficient matching system that delivers data with an unparalleled level of quality.
Comparing the collected data with this reference database verifies that the tire described online actually exists: does this brand make this tire? Does it exist with these technical attributes?
This allows data collected online to be systematically and automatically categorized:
Finally, implausible prices are also filtered out (e.g. a touring tire sold for less than $20 is unlikely to be real).
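The matching and filtering logic described above could be sketched as follows. Everything here is an assumption made for illustration: the reference base is a two-entry dictionary, the identifiers (`REF-001`, `REF-002`) and status labels are invented, and the $20 threshold is simply the example quoted above. A real system would rely on a far richer reference base and on fuzzy matching.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TireKey:
    brand: str
    size: str
    oe_marking: Optional[str] = None


# Hypothetical reference base: validated tires and their internal identifiers.
REFERENCE = {
    TireKey("EXAMPLEBRAND", "205/55R16 91V"): "REF-001",
    TireKey("EXAMPLEBRAND", "205/55R16 91V", "AO"): "REF-002",
}

MIN_PLAUSIBLE_PRICE = 20.0  # threshold taken from the example above


def normalize_size(size: str) -> str:
    """Bring '205/55 R16  91V' and '205/55R16 91V' to the same canonical form."""
    compact = re.sub(r"\s*R\s*", "R", size.upper())
    return " ".join(compact.split())


def categorize(offer: dict) -> Tuple[str, Optional[str]]:
    """Return a (status, reference id) pair for one collected offer."""
    key = TireKey(
        offer["brand"].strip().upper(),
        normalize_size(offer["size"]),
        offer.get("oe_marking"),
    )
    ref_id = REFERENCE.get(key)
    if ref_id is None:
        return ("unmatched", None)  # no such tire in the reference base
    if offer["price"] < MIN_PLAUSIBLE_PRICE:
        return ("implausible_price", ref_id)
    return ("matched", ref_id)


offers = [
    {"brand": "ExampleBrand", "size": "205/55 R16 91V", "oe_marking": "AO", "price": 112.0},
    {"brand": "ExampleBrand", "size": "205/55R16 91V", "price": 9.9},
    {"brand": "OtherBrand", "size": "205/55 R16 91V", "price": 80.0},
]
for offer in offers:
    print(categorize(offer))
```

Note that the tire with and without the AO marking resolve to two different references, which is exactly the distinction that protects the average price analysis.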
The combination of a high-quality tire database, matching technologies, Machine Learning algorithms and product marketing expertise provides quality data with a high level of completeness.
This will allow you to focus on your business (the analysis) and gain efficiency. Indeed, analysts and Data Scientists spend between 50 and 80% of their time cleaning data before they can start working with it!
The preparation step converts unified and enhanced data into a dataset (cube) that integrates your business rules and your internal data, but also your vision of the market and your segmentation. Data on competitors’ prices is a key element in understanding the market, but internal data is just as important.
Examples of business rules:
Examples of Internal data:
This is the step where the accuracy and consistency of online data collection are particularly important. To make sense of these two datasets, the online data must have the same granularity as the internal data, so that the two sources can be matched at the same level and the same elements are compared.
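A minimal sketch of this alignment, assuming both sources are keyed by reference tire and week. The field names, the identifiers and the idea of comparing an internal list price with the market average are illustrative assumptions, not a description of an actual data model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical online offers, already matched to a reference tire.
online = [
    {"ref_id": "REF-001", "week": "2024-W10", "price": 95.0},
    {"ref_id": "REF-001", "week": "2024-W10", "price": 105.0},
    {"ref_id": "REF-002", "week": "2024-W10", "price": 120.0},
]

# Hypothetical internal data: own list price per reference tire and week.
internal = {("REF-001", "2024-W10"): 98.0}


def market_average(rows):
    """Aggregate online offers to the (reference tire, week) level."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["ref_id"], row["week"])].append(row["price"])
    return {key: mean(prices) for key, prices in grouped.items()}


def price_gaps(online_rows, internal_prices):
    """Internal price minus market average, for keys present in both sources."""
    market = market_average(online_rows)
    shared = market.keys() & internal_prices.keys()
    return {key: round(internal_prices[key] - market[key], 2) for key in shared}


gaps = price_gaps(online, internal)
print(gaps)  # {('REF-001', '2024-W10'): -2.0}
```

The comparison only works because both sides use the same key. If the online data could not tell REF-001 from REF-002, the internal price would be compared against a blended average.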
This is the last step before starting to ‘consume’ data in business tools.
Why should you trust us with this methodology for analyzing sell-out prices in the tire market?
Lizeo Group supports its customers in their data-driven digital transformation and has been collecting tire price data online on a daily basis for more than 10 years, from over 1,000 e-commerce sites worldwide.
This represents approximately 11.5 million price lines per day, in all currencies, which are cleaned, matched, enriched and ready to use.
© Lizeo Group 2024, all rights reserved