Generate Synthetic Data for AI Vision Training

4 min read · Jun 14, 2022

When building an AI solution to recognize objects as part of a wider AI vision solution, one has to realize that around 80% of the development effort will most likely go into collecting and preparing data. Determining how much data you will need is therefore a critical first step in correctly estimating the effort and cost of the whole project. A recent study from iMerit outlines the sizing of your learning set in more detail and is worth exploring when trying to calculate the total effort of your development work.

While roughly 80% of the effort goes into collecting and preparing the data to train your model, there is a second catch to consider. If your model requires, for example, a thousand images for training, it can only be trained if those thousand images are actually available.

In some fields, a thousand images of a specific object might not be available. In the defence space, for example, a thousand images of a new weapon system might not be directly available to you, even though you want to develop an AI model that can detect this specific weapon system when it appears in a wider set of collected data.

The solution to this specific problem, and also to the challenge of building and preparing large datasets in general, is synthetic data generation.
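To make the idea a bit more concrete, here is a minimal sketch of one simple form of synthetic data generation: pasting a known object onto randomly generated backgrounds and recording where it was placed. This is an illustration only, not the method used in the study or tools mentioned in this article. The names `make_synthetic_sample`, `make_dataset` and `OBJECT_PATCH` are hypothetical, the "object" is just a bright rectangle, and a real pipeline would more likely composite photographs or rendered 3D models.

```python
import random


def make_synthetic_sample(patch, width, height, rng):
    """Paste `patch` onto a random-noise background; return (image, bbox)."""
    patch_h, patch_w = len(patch), len(patch[0])
    # Background: random grayscale noise kept in the darker range 0-127.
    image = [[rng.randint(0, 127) for _ in range(width)] for _ in range(height)]
    # Randomly mirror the object so more than one orientation is covered.
    if rng.random() < 0.5:
        patch = [row[::-1] for row in patch]
    # Pick a top-left corner that keeps the whole patch inside the image.
    x = rng.randint(0, width - patch_w)
    y = rng.randint(0, height - patch_h)
    for dy, row in enumerate(patch):
        image[y + dy][x:x + patch_w] = row
    # The label comes for free because we chose the placement ourselves.
    return image, (x, y, patch_w, patch_h)


def make_dataset(patch, count, width=64, height=64, seed=0):
    """Generate `count` labelled samples with a reproducible random seed."""
    rng = random.Random(seed)
    return [make_synthetic_sample(patch, width, height, rng) for _ in range(count)]


# Hypothetical 6x4 "object": a bright rectangle with a slightly darker edge.
OBJECT_PATCH = [[200, 255, 255, 255, 255, 255] for _ in range(4)]

dataset = make_dataset(OBJECT_PATCH, count=1000)
print(len(dataset), "labelled images generated")
```

The point of the sketch is that every generated image arrives with its bounding box already known, so the thousand labelled examples from the scenario above can be produced from a single description of the object without any manual collection or annotation.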

Synthetic data generation
As per the view of Gartner the use of…

Written by Johan Louwers
