Language2motion project in Swift

wojtekcz · April 25, 2020, 3:06pm

Hi all,

I’ve started the language2motion project with the goal of creating a multi-modal implementation of the Transformer architecture in Swift. It’s a learning exercise for me and an attempt to answer the question of whether Swift for TensorFlow is ready for non-trivial work.

The use case is based on the paper "Learning a bidirectional mapping between human whole-body motion and natural language using deep recurrent neural networks" by Matthias Plappert. He created a nice dataset of a few thousand motions, “The KIT Motion-Language Dataset”.

Feel free to check it out and contribute.

saeta · May 1, 2020, 4:14pm

As you encounter issues, please do reach out on the mailing list (swift@tensorflow.org) or here. Excited to see your progress!

Michal_w · May 7, 2020, 7:49pm

Hi Wojtek

I just followed your project on GitHub. Nice approach to image motion visualization and description.
I also downloaded the label descriptions and forked your repository.
The labels were created by multiple people, so, as you are probably aware, there are multiple descriptions of the same activity.

['a', 'person', 'is', 'walking', 'forwards']
['a', 'person', 'walks', '4', 'steps', 'forward']
['a', 'human', 'walking']

To me, those labels have very similar meanings.
The word statistics also show that 1775 different words are used in the label vocabulary.

If we look at the word counts:

[('a', 7235),
('person', 4262),
('the', 2259),
('walks', 2248),
('and', 1524),
('forward', 1390),
('to', 1338),
('is', 1257),
('human', 1140),
('right', 1098),
('left', 1036),
('steps', 991),
('with', 876),
('walking', 868),

There is only a handful of useful words that describe motion.
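For anyone who wants to reproduce this kind of count, here is a minimal sketch using `collections.Counter`. The `descriptions` list is only the three example descriptions quoted above, not the full dataset, so the numbers will not match the statistics in this post.

```python
from collections import Counter

# Assumption: a tiny sample made of the three tokenized descriptions quoted above.
descriptions = [
    ["a", "person", "is", "walking", "forwards"],
    ["a", "person", "walks", "4", "steps", "forward"],
    ["a", "human", "walking"],
]

# Count every token across all descriptions.
word_counts = Counter(token for tokens in descriptions for token in tokens)

# Number of distinct words and the most frequent ones.
vocabulary_size = len(word_counts)
top_words = word_counts.most_common(3)
print(vocabulary_size, top_words)
```

On the full label set, `len(word_counts)` would give the vocabulary size and `most_common()` the ranking shown above.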

NLP is completely new to me. I hope I can help in Python, and only with small bits in Swift (also a whole new area for me).

cheers

Michal

Michal_w · May 12, 2020, 12:03pm

Hi,

I just finished working on the labels: I simplified and unified them.

github.com

mwawrzyniuk/language2motion/blob/labels/code/dataset_labeling/filtered_labels.txt

00001: walk forward
00002: walk forward
00003: walk
00004: walk forward
00004: goe forward
00005: walk forward
00006: goe forward
00006: walk forward
00007: walk
00007: walk forward
00008: walk forward
00009: stand walk
00009: left do forward walk
00009: make forward
00010: walk backward
00011: left backward walk
00011: walk backward
00013: walk backward
00014: walk backward
00015: walk backward

This file has been truncated.
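A rough sketch of how labels like these could be derived from the tokenized descriptions is below. It is not the code from the notebook: the stop-word set and the normalization table are hypothetical, picked only so the example is self-contained, while the actual cleaning was done with NLTK.

```python
# Hypothetical stop-word list and normalization table, for illustration only.
STOP_WORDS = {"a", "the", "is", "and", "to", "with", "person", "human"}
NORMALIZE = {"walks": "walk", "walking": "walk", "forwards": "forward"}


def simplify(tokens):
    """Drop stop words and digits, then map word variants to one form."""
    kept = [t for t in tokens if t not in STOP_WORDS and not t.isdigit()]
    return " ".join(NORMALIZE.get(t, t) for t in kept)


print(simplify(["a", "person", "is", "walking", "forwards"]))
print(simplify(["a", "human", "walking"]))
```

With these assumed tables, the first and third example descriptions from the earlier post reduce to "walk forward" and "walk".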

Notebook with the work in progress:

github.com

mwawrzyniuk/language2motion/blob/labels/code/dataset_labeling/l2m_labels_cleaning.ipynb

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import nltk"
   ]
  },

This file has been truncated.

Cheers

wojtekcz · May 21, 2020, 5:38am

Hi Michał,

Nice NLTK work. I suppose you can select a few combinations of tokens, output a set of labels, and plug them into one of the models we have: BERT-language2label, ResNet-img2label or ResNet-motion2label. I’m curious how your labels will perform.
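One possible way to turn simplified labels into a class set for a classifier is sketched below. The sample lines follow the `id: label` format of filtered_labels.txt, but the `min_count` threshold and the catch-all "other" class are assumptions made for this example, not something the project prescribes.

```python
from collections import Counter

# Assumption: a few sample lines in the "id: label" format of filtered_labels.txt.
lines = [
    "00001: walk forward",
    "00002: walk forward",
    "00003: walk",
    "00010: walk backward",
    "00011: walk backward",
    "00011: left backward walk",
]

# Keep only the label part of each line.
labels = [line.split(": ", 1)[1] for line in lines]
label_counts = Counter(labels)

# Keep labels seen at least min_count times; the rest fall into "other".
min_count = 2
classes = sorted(l for l, c in label_counts.items() if c >= min_count) + ["other"]
label_to_index = {label: i for i, label in enumerate(classes)}

# Integer targets, one per sample, usable as classifier labels.
targets = [label_to_index.get(l, label_to_index["other"]) for l in labels]
print(classes, targets)
```

The resulting `targets` list is the kind of input a label classifier would train on, with rare token combinations grouped together.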

wojtekcz · May 24, 2020, 8:49am

@Michal_w
I’ve uploaded 2 new processed datasets for your X10 1-channel ResNet performance tests:

Michal_w · May 24, 2020, 10:43am

Superb, I will update you on the progress of moving to X10 and on performance.