I recently left Google (8 years, 4 months, 4 days :)) and co-founded a deep learning startup. We’re building a large-scale model to learn human behavior and to detect, predict, and explain suspicion and fraud. We’re focusing on insurance that covers people, their accidents, and their health, such as disability and workers’ compensation, among other areas.
I’m a huge fan of fast.ai and Jeremy’s incredible teaching methods. I’ve taken many of the courses up to v3 part 2.
“Insurance” may sound like a boring problem domain, but it’s far from that. On the ethical side, it’s a huge problem: an estimated $40B-$80B is paid out to fraud annually, with very little recourse against the fraudsters. And we’re all paying for it. The detection process today is almost entirely manual and based on human intuition, not data-driven. Our mission is to close this loophole by leveraging large historical data sets and deep learning, and by introducing a modern, systematic, real-time detection approach. We put it in the hands of professional detectives. We provide real-time monitoring, alerts, explanations, and supporting evidence.
I’ve worked on and built a number of machine learning pipelines at Google, and this data problem is definitely on the advanced end of the spectrum. The data set is a time series of events starting with an accident and injury, followed by many diagnoses, interviews, witness statements, medical reports, and treatments, and ending with some final outcome. The outcome can be a full recovery, a partial disability, or a permanent disability.
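To make the shape of that data a bit more concrete, here is a minimal Python sketch of how one claim’s timeline could be modelled. This is only an illustration and not our actual schema: the names `Outcome`, `Event`, `Claim`, and `timeline`, the event kinds, and the toy values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    """The three final outcomes mentioned above."""
    FULL_RECOVERY = "full recovery"
    PARTIAL_DISABILITY = "partial disability"
    PERMANENT_DISABILITY = "permanent disability"


@dataclass
class Event:
    """One dated event in a claim (hypothetical fields)."""
    when: date
    kind: str       # e.g. "accident", "diagnosis", "interview", "treatment"
    text: str = ""  # free-text note or document body, if any


@dataclass
class Claim:
    """A claim is a list of events plus an outcome once it is known."""
    claim_id: str
    events: List[Event] = field(default_factory=list)
    outcome: Optional[Outcome] = None  # None while the claim is still open

    def timeline(self) -> List[Event]:
        """Return the events in chronological order."""
        return sorted(self.events, key=lambda e: e.when)


# Toy, made-up example: events arrive out of order and get sorted by date.
claim = Claim(claim_id="demo-1")
claim.events.append(Event(date(2020, 1, 20), "diagnosis"))
claim.events.append(Event(date(2020, 1, 15), "accident"))
claim.events.append(Event(date(2020, 2, 3), "treatment"))
claim.outcome = Outcome.FULL_RECOVERY

for event in claim.timeline():
    print(event.when.isoformat(), event.kind)
```

Keeping the outcome optional reflects that a claim is observed while it is still unfolding, which is what makes real-time detection on a partial timeline the interesting part.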
This time series is represented by a large structured data set as well as a large unstructured corpus (tens of millions of documents). The problem domain is super interesting - we’re building a real-world “Minority Report” to stop bad actors from abusing the system, help the good actors get paid immediately, and stop bad medical providers from doing evil.
There is a lot of interesting NLP work. The data sets you’d be working with involve individual demographics, job histories, geography mapping, health history, injury causes, recovery and treatment methods, objective witness statements and subjective claimant reports, industry codes, criminology, graph theory and workplace safety, to name a few.
People pull crazy schemes every day and get away with it. A few excellent narratives:
We’re looking for one more data engineer and/or data scientist with strong Python programming skills to join the team. We are super early, so you’re getting in on the ground floor, but we are well funded (we recently closed a large seed round from several top-tier VCs in SF and NYC). We already have a large data set and a customer, and we are seeing early success with our initial model. Our office is in the middle of Manhattan.
You’d be working alongside industry experts and researchers. We’re building a company culture that takes the best engineering and cultural practices from Google and mixes in the freedom of working at a startup (flexible geography, extreme ownership, flat hierarchy, cool off-sites to tropical destinations…).
You can publish white papers and speak at conferences. We’re working closely with Google (TF + cloud teams), so you’ll also get access to that resource. You can use your favorite tools, including PyTorch and FastAI, and get access to powerful VMs.
If you’re interested in the current position, want to stay in touch for when we start hiring again, or want to collaborate because you’re working on a similar problem, connect with me on LinkedIn (mention fastai) or email me: tomasv at deepfraud.com.
We are currently in semi-stealth mode, so you won’t find much info about us online.