Business idea with NLP

Business idea:

The idea is to offer a service that “cleans” the list of materials a company keeps in its ERP system, so that a string like the following:

Example 1 (English)

CAS SYSTEM DUAL FREQ LIGHT VEHICLE CAS SYSTEM DUAL FREQ LIGHT VEHICLE -GEN 2 PROD0240 G1 SYSTDFL01-VI GENERAL ELECTRIC

should be converted into something like this:
CAS SYSTEM DUAL FREQ LIGHT VEHICLE GENERAL ELECTRIC PROD0240 GEN 2 G1 SYSTDFL01-VI

Example 2 (Spanish)

ALUMBRADO PUBLICO 86X295X492MM 120WATTS ALUMBRADO PUBLICO 86x295x492MM MODELO GREEN VISION XCEED N/P PH643120W MARCA PHILIPS

should be converted into something like this:

ALUMBRADO PUBLICO PHILIPS PH643120W MODELO GREEN VISION XCEED 86X295X492MM 120W

A good start would be to cover these steps:

  1. Remove repeated words
  2. Clean up extra whitespace
  3. Complete truncated words
  4. Order words in a meaningful way (Noun + modifier + part number)
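
Steps 1 and 2 need no NLP at all, so they make a reasonable baseline before any model is involved. The following is only a minimal sketch in Python, not a finished cleaner: the function names are made up for illustration, tokens are compared case-insensitively, and stray leading or trailing hyphens (as in "-GEN") are assumed to be noise. Steps 3 and 4 are not covered here, since they would need an abbreviation dictionary or a trained model.

```python
def normalize_whitespace(text: str) -> str:
    """Step 2: collapse runs of whitespace and trim both ends."""
    return " ".join(text.split())


def remove_repeated_tokens(text: str) -> str:
    """Step 1: keep only the first occurrence of each token.

    Assumption: tokens are compared case-insensitively and stray
    leading/trailing hyphens are treated as noise.
    """
    seen = set()
    kept = []
    for token in text.split():
        cleaned = token.strip("-")
        key = cleaned.upper()
        if key and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return " ".join(kept)


def clean_description(text: str) -> str:
    """Apply whitespace cleanup, then de-duplication."""
    return remove_repeated_tokens(normalize_whitespace(text))


raw = ("CAS SYSTEM DUAL FREQ LIGHT VEHICLE CAS SYSTEM DUAL FREQ LIGHT VEHICLE "
       "-GEN 2 PROD0240 G1 SYSTDFL01-VI GENERAL ELECTRIC")
print(clean_description(raw))
# CAS SYSTEM DUAL FREQ LIGHT VEHICLE GEN 2 PROD0240 G1 SYSTDFL01-VI GENERAL ELECTRIC
```

One caveat with this approach: dropping every repeated token can also remove legitimate repeats (for example, the same number appearing in two different measurements), so a real service would probably need a smarter rule for numeric tokens.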

I know of companies that offer this service, and they do it manually. With NLP (perhaps transformers), we might be able to compete in that market.

Who’s in?