How to Cope With Limited Data In Deep Learning

We once had very little data to train our network, clearly not enough to get successful learning results. Therefore we had to "create" additional data without violating the original patterns.

Let us share our experience with you. Here is a summary of the solution we found.

We wanted to use deep neural networks to recommend TV programs to viewers based on their preferences. To do this, we needed each viewer's watching history as training data. Unfortunately, at that time no such history had been recorded by the participants.

Instead, we asked each participant to rank their top 5 preferences among 13 genres (movie, news, sports, etc.) for three time slots: weekday evenings, weekend daytime, and weekend evenings. The first preference is the genre the participant would choose when programs of all 13 genres are available; the second preference is the genre they would choose when the first-choice genre is unavailable but all the others are; and so on.

As an example, we obtained a list like the following for each participant:

| Order (high to low) | Weekday Evenings | Weekend Daytime | Weekend Evenings |
|---|---|---|---|
| 1 | Series | Music | Movie |
| 2 | News | Sports | Series |
| 3 | Documentary | Show | Music |
| 4 | Movie | Documentary | Show |
| 5 | Sports | Leisure | News |

We could then generate data from these reference lists.

The user picks their first preference when all 13 genres are available; they would therefore make the same choice for every combination of availability of the other 12 genres, provided the first-choice genre is present. This yields 2^12 = 4096 alternatives.
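To make the expansion concrete, here is a minimal Python sketch of this enumeration. The genre list, the function name first_choice_examples, and the binary-dict encoding of availability are our illustrative assumptions; the post names only eight of the 13 genres, so the last five below are placeholders.

```python
from itertools import product

# 13 genres; the post names only eight, so the last five are placeholders.
GENRES = ["movie", "news", "sports", "series", "music", "show",
          "documentary", "leisure", "kids", "culture", "reality",
          "quiz", "religion"]

def first_choice_examples(first_choice):
    """Label every availability pattern that contains the first-choice
    genre with that same choice."""
    others = [g for g in GENRES if g != first_choice]
    examples = []
    for mask in product([0, 1], repeat=len(others)):  # 2**12 patterns
        availability = dict(zip(others, mask))
        availability[first_choice] = 1  # first choice is always present
        examples.append((availability, first_choice))
    return examples

assert len(first_choice_examples("series")) == 2 ** 12  # 4096
```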

Similarly, the user picks their second choice when the first choice is unavailable but the other 12 genres are available; they would therefore make the same choice for every combination of availability of the remaining 11 genres, provided the first choice is absent and the second choice is present. This yields 2^11 = 2048 alternatives.

Repeating the same procedure for the third, fourth, and fifth choices (2^10, 2^9, and 2^8 combinations, respectively), we prepared 4096 + 2048 + 1024 + 512 + 256 = 7936 examples per person per time slot without breaking the logic.

Considering the three different time slots, we ended up with 3 × 7936 = 23,808 examples per person, which was sufficient for deep learning.
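Putting the whole procedure together, a sketch of the full expansion might look like the following. It reuses GENRES and product from the sketch above; the rankings dict encodes the example participant's table, and generate_examples is our illustrative name, not part of the original system.

```python
def generate_examples(ranking, genres=GENRES):
    """Expand a top-5 ranking for one time slot into labeled examples.
    For the k-th choice, every higher-ranked genre is absent, the k-th
    choice is present, and the remaining genres vary freely."""
    examples = []
    for k, choice in enumerate(ranking):
        higher = ranking[:k]  # higher-ranked genres must be unavailable
        free = [g for g in genres if g != choice and g not in higher]
        for mask in product([0, 1], repeat=len(free)):  # 2**(12 - k) patterns
            availability = {g: 0 for g in higher}
            availability[choice] = 1
            availability.update(zip(free, mask))
            examples.append((availability, choice))
    return examples

# The example participant's rankings from the table above.
rankings = {
    "weekday_evenings": ["series", "news", "documentary", "movie", "sports"],
    "weekend_daytime":  ["music", "sports", "show", "documentary", "leisure"],
    "weekend_evenings": ["movie", "series", "music", "show", "news"],
}

total = 0
for slot, ranking in rankings.items():
    rows = generate_examples(ranking)
    assert len(rows) == 4096 + 2048 + 1024 + 512 + 256  # 7936 per slot
    total += len(rows)
assert total == 23_808
```

One natural encoding for training is a 13-bit availability vector as the network input and the chosen genre as the target class.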

Please contact us for the details if you are interested in hearing more about this.