Stable Diffusion in Portuguese

Hi.
Like 99.99% of deep learning models, Stable Diffusion was trained mainly on English text. It is therefore necessary to do prompt engineering in English (or to use a translator to obtain an English prompt).

But maybe this is not necessary for some languages, either because of their linguistic proximity to English or because texts in these languages were used when training the model (or one of its components, such as CLIP).

To check this for Portuguese, I took the English prompts from the fastai notebook stable_diffusion.ipynb, translated them with Google Translate, and ran the same code. Here are the results.

Apart from the negative prompt, which not only did not work (the blue remained) but also generated a different image, the results are quite good compared with the images generated from the English prompts.

It would be interesting to test all the other possibilities of Stable Diffusion with texts in Portuguese and to post the results and comments.
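One way to organize such side-by-side tests is sketched below. This is only a minimal sketch, not the notebook's code: `PROMPT_PAIRS`, `run_pairs` and the `generate` callable are hypothetical names introduced here, and the only assumption is that each English prompt and its translation are run with the same seed so that the outputs are comparable.

```python
from typing import Callable, Dict, List, Tuple

# English prompts from this post, paired with the Portuguese translations used.
PROMPT_PAIRS: List[Tuple[str, str]] = [
    ("a photograph of an astronaut riding a horse",
     "uma fotografia de um astronauta andando a cavalo"),
    ("Labrador in the style of Vermeer",
     "Labrador no estilo de Vermeer"),
    ("Wolf howling at the moon, photorealistic 4K",
     "Lobo uivando para a lua, 4K fotorrealista"),
]


def run_pairs(generate: Callable[[str, int], object],
              pairs: List[Tuple[str, str]],
              seed: int = 1000) -> List[Dict[str, object]]:
    """Run each (English, Portuguese) pair with the same seed.

    `generate` is a hypothetical wrapper supplied by the caller: it takes
    a prompt and a seed and returns whatever the pipeline produces.
    """
    results = []
    for en_prompt, pt_prompt in pairs:
        results.append({
            "en_prompt": en_prompt,
            "pt_prompt": pt_prompt,
            "en_output": generate(en_prompt, seed),
            "pt_output": generate(pt_prompt, seed),
        })
    return results
```

With the `pipe` object from the notebook, `generate` could call `torch.manual_seed(seed)` and then return `pipe(prompt).images[0]`, as in the snippets in this post.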

Another post on this subject of Stable Diffusion in a language other than English: Share your work here ✅ (Part 2 2022) - #15

# English prompt: a photograph of an astronaut riding a horse
prompt = "uma fotografia de um astronauta andando a cavalo"
pipe(prompt).images[0]

# English prompt: Labrador in the style of Vermeer
torch.manual_seed(1000)
prompt = "Labrador no estilo de Vermeer"
pipe(prompt).images[0]

# English negative prompt: blue (reuses the Labrador prompt above)
torch.manual_seed(1000)
pipe(prompt, negative_prompt="azul").images[0]

# English prompt: Wolf howling at the moon, photorealistic 4K
# (image-to-image: init_image must already be defined, as in the notebook)
torch.manual_seed(1000)
prompt = "Lobo uivando para a lua, 4K fotorrealista"
images = pipe(prompt=prompt, num_images_per_prompt=3, init_image=init_image, strength=0.8, num_inference_steps=50).images
image_grid(images, rows=1, cols=3)

# English prompt: Oil painting of wolf howling at the moon by Van Gogh
torch.manual_seed(1000)
prompt = "Pintura a óleo de lobo uivando para a lua por Van Gogh"
images = pipe(prompt=prompt, num_images_per_prompt=3, init_image=init_image, strength=1, num_inference_steps=70).images
image_grid(images, rows=1, cols=3)


@pierreguillou that’s extremely cool to know 🙂

Hi @pierreguillou, I also tried in Portuguese and it worked most of the time. Where I noticed a strong difference was in “cultural” bias, but that is expected, as the internet as a whole has a cultural bias. I mean, if you ask for the styles of Brazilian artists, you will get worse results than for European/American ones.

artists = ["Alfredo Volpi", "Burle Marx", "Stephan Doitschinoff", "Bicicleta sem Freio", "Gilvan Samico", "Athos Bulcao"]


artists = ["Salvador Dali", "Paul Gaugin", "Andy Warhol", "Francis Bacon", "Francisco Goya", "Vermeer"]
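For anyone who wants to repeat this comparison, here is a minimal sketch of how the two lists could be turned into prompts. The subject and the template are assumptions (borrowed from the "Labrador no estilo de Vermeer" prompt earlier in the thread), and `style_prompts`, `brazilian_artists` and `international_artists` are hypothetical names, not the ones actually used for the images.

```python
# Artist names are copied as written in this thread.
brazilian_artists = ["Alfredo Volpi", "Burle Marx", "Stephan Doitschinoff",
                     "Bicicleta sem Freio", "Gilvan Samico", "Athos Bulcao"]
international_artists = ["Salvador Dali", "Paul Gaugin", "Andy Warhol",
                         "Francis Bacon", "Francisco Goya", "Vermeer"]


def style_prompts(subject, artists,
                  template="{subject} no estilo de {artist}"):
    """Build one Portuguese style prompt per artist (assumed template)."""
    return [template.format(subject=subject, artist=artist)
            for artist in artists]


# Same subject for both groups, so only the artist name changes.
br_prompts = style_prompts("Labrador", brazilian_artists)
intl_prompts = style_prompts("Labrador", international_artists)
```

Each list can then be passed to `pipe(...)` with the same seed, so that the artist name is the only thing that differs between images.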


Check this out:

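# English gloss: a photo of an astronaut riding on horseback
prompts = ["uma foto de um astronauta andando a cavalo"]*6
ptBR_imgs = pipe(prompts, num_inference_steps=30).images
image_grid(ptBR_imgs, rows=2, cols=3)

# English gloss: a photo of an astronaut riding (mounting) a horse
prompts = ["uma foto de um astronauta montando um cavalo"]*6
ptBR_imgs = pipe(prompts, num_inference_steps=30).images
image_grid(ptBR_imgs, rows=2, cols=3)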

I’ve experimented with ChatGPT using several prompts in both Portuguese and English, trying to identify patterns in the differences between the generated texts.

The main finding was that, at least for ChatGPT, it would seem that the prompt is translated from the source language into English and the resulting text is translated back, so that the generative core operates only in English!
The evidence for this conclusion comes from experiments with prompts asking for poetry in specific forms, with explicit attention to rhyme and meter (sonnets, alexandrines…). These structural properties are somewhat respected in English but not at all in Portuguese. However, back-translating the free-form Portuguese verses into English frequently yields rhyming verses, which suggests that the structural prompts were effective but were simply ‘lost in translation’…
This evidence is still somewhat fragile and should be strengthened, both to confirm that there really are distinct translation stages in the generative pipeline and to check whether the same happens in generative models other than ChatGPT. But such a design would certainly make sense from an architectural perspective, as a way to ensure homogeneous performance and content filtering in any language for which automatic translation is good enough.
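To make the rhyme comparison a bit less subjective, something like the sketch below could be run on both the Portuguese output and its English back-translation. It is only a crude, spelling-based heuristic (it compares the last letters of each line's final word and knows nothing about pronunciation), and `line_ending` and `rhyme_scheme` are hypothetical helpers, not part of any library.

```python
import re


def line_ending(line, n=2):
    """Return the last n letters of the last word in a line, lowercased."""
    words = re.findall(r"[^\W\d_]+", line.lower())
    return words[-1][-n:] if words else ""


def rhyme_scheme(lines, n=2):
    """Label lines A, B, C... by shared word endings (spelling only)."""
    labels = {}
    scheme = []
    for line in lines:
        ending = line_ending(line, n)
        if ending not in labels:
            # First time this ending is seen: assign the next letter.
            labels[ending] = chr(ord("A") + len(labels))
        scheme.append(labels[ending])
    return "".join(scheme)


poem = [
    "The sun went down behind the hill",
    "The night came on, the air was still",
    "A wolf looked up toward the moon",
    "And it would all be over soon",
]
print(rhyme_scheme(poem))  # AABB
```

A scheme that is mostly distinct letters for the Portuguese text but shows repeated letters for its back-translation would be consistent with the observation above, although spelling-based matching will miss rhymes such as "moon" and "tune".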
