Wednesday, March 22, 2023

Ingredient recommendation with Word2Vec

As a project during pat leave, I used ML to build an ingredient recommender. The model is relatively simple: it uses Word2Vec embeddings and co-occurrence counts to recommend ingredients.

One interesting observation is that the model appeared to perform better with smaller ingredient embeddings. For example, with 100-dimensional embeddings, "carrot" came out as similar to "bone," but with 16 dimensions that was less true.

The base embeddings already gave some intuitive ingredient recommendations. For example, if you asked for an ingredient that goes with "kiwi and banana," "yogurt" was a top answer. Or if you asked what to do with "lettuce and feta cheese," it suggested a wrap.
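One way to picture a query like this is to average the vectors of the ingredients you have and rank everything else by cosine similarity to that average. The sketch below is only an illustration of that idea, not the code behind the recommender: the three-dimensional vectors in `TOY_EMBEDDINGS` are made up by hand, and the names `suggest`, `mean_vector`, and `cosine` are hypothetical helpers.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# Real Word2Vec embeddings would be learned from recipe data.
TOY_EMBEDDINGS = {
    "kiwi": [0.9, 0.1, 0.0],
    "banana": [0.8, 0.2, 0.1],
    "yogurt": [0.7, 0.3, 0.2],
    "beef": [0.0, 0.1, 0.9],
    "onion": [0.1, 0.9, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise average of a list of vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def suggest(query, embeddings, top_n=3):
    """Rank ingredients outside the query by similarity to the query's mean vector."""
    query_vec = mean_vector([embeddings[name] for name in query])
    candidates = [
        (name, cosine(query_vec, vec))
        for name, vec in embeddings.items()
        if name not in query
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    for name, score in suggest(["kiwi", "banana"], TOY_EMBEDDINGS):
        print(f"{name}: {score:.3f}")
```

With these toy vectors, "yogurt" ranks first for "kiwi" and "banana", but only because the numbers were chosen to make that happen.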

To make the embeddings into a more fleshed-out recommender, I calculated how often ingredients appeared together. This allowed the recommender to find ingredients that are similar to each other but don't often occur together (as those would be more "novel").

This seemed to work on a few examples: for instance, it recommended "kumquats" for oranges.
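The idea of combining similarity with co-occurrence can be sketched roughly as follows. This is a guess at one simple way to do it, not the actual implementation: it counts how often each pair of ingredients shows up in the same recipe, drops candidates that already appear with the target more than `max_together` times, and ranks the rest by cosine similarity. The recipes, the two-dimensional vectors, the `max_together` cutoff, and all function names are assumptions made up for the example.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(recipes):
    """Count how many recipes contain each unordered pair of ingredients."""
    counts = Counter()
    for recipe in recipes:
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(set(recipe)), 2):
            counts[(a, b)] += 1
    return counts


def pair_count(counts, a, b):
    """Look up a pair regardless of argument order (0 if never seen together)."""
    return counts[tuple(sorted((a, b)))]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def novel_matches(target, embeddings, counts, max_together=0, top_n=3):
    """Similar ingredients that appear with the target at most max_together times."""
    scored = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        if pair_count(counts, target, name) > max_together:
            continue  # already a common pairing, so not "novel"
        scored.append((name, cosine(embeddings[target], vec)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy data, invented for illustration only.
TOY_RECIPES = [
    ["orange", "sugar"],
    ["orange", "sugar", "butter"],
    ["kumquat", "honey"],
    ["beef", "onion"],
]
TOY_EMBEDDINGS = {
    "orange": [0.9, 0.1],
    "kumquat": [0.85, 0.2],
    "sugar": [0.6, 0.5],
    "beef": [0.0, 1.0],
}

if __name__ == "__main__":
    counts = cooccurrence_counts(TOY_RECIPES)
    for name, score in novel_matches("orange", TOY_EMBEDDINGS, counts):
        print(f"{name}: {score:.3f}")
```

A hard cutoff is the simplest option; a softer variant could instead subtract a penalty that grows with the co-occurrence count, so common pairings are pushed down rather than removed outright.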

If you are interested in the details of the training, I wrote up a notebook.

And if you want to try the recommender directly, you can use it here.