AI in research and innovation: an experiment

Reading time: 13 minutes

Are we market researchers, innovation developers and designers still needed at all, or is AI taking over our jobs because it does them better? Conversely, is AI of any use to us in our professional processes, and is the best approach perhaps a happy win/win team? We explored these questions in an experiment.

The discussion about artificial intelligence (AI) ranges from hyped expectations through practical use cases to disappointment. Although we have been using GenAI in our projects for about a year and have gained a lot of experience, we wanted to know exactly where we can use AI effectively in our work and where relying on HI (human intelligence) is the better choice.

What does “our work” mean? Together with our clients, we develop innovations, prototypes or design manuals based on psychological-creative research. To do this, we work in an interdisciplinary way using the InsightArt mindset that we developed ourselves. We think of research and innovation together; the two merge into one process: innovation-led research and research-led innovation. However, the findings from our experiment can certainly also be applied, at least in part, to other kinds of research and development processes.

As a reference for the experiment, we used several of our projects that had been carried out purely with HI, by hand, and in which we went through the entire R&D process: psychological research, innovation development, communication and design creation. We then had different generative AIs perform each step in combination: ChatGPT, Neuroflash, Midjourney and Adobe Firefly. We ran the tasks in different settings and compared each with the results from the HI projects:

  1. AI as researcher:
    The AI was instructed to provide us with research results on the respective research topic without being fed any information beforehand
  2. AI as respondent:
    Here, the AI was asked the same questions we asked human respondents in psychological interviews
  3. AI as uninformed idea developer:
    The AI was asked to develop ideas without knowledge of the research findings
  4. AI as a free idea developer:
    The AI first received information from the research, e.g., persona descriptions, and was asked to freely develop ideas based on this information
  5. AI as a participant in a creation workshop:
    The AI went through a strategic creation process based on our workshop concepts, with procedures and techniques developed specifically for the task
  6. AI as designer:
    The AI received a creative brief derived from the research findings and was asked to develop design ideas for communication and execute them (ChatGPT writes the instructions, i.e. the prompt, for the image-generation AI Midjourney)

In each setting, the prompting was systematically varied (see box below for details of the test settings).

AI as a researcher

Do we still need to do our own research? The results that ChatGPT spat out line by line and in bullet points weren’t that far off from ours. Had the AI perhaps also been trained on our research results?

At second glance, however, we noticed that a list of the needs and motives for why someone would, for example, insure their cell phone or book a health vacation didn’t help us. It was not apparent which needs are essential and which are perhaps just a sub-item of another need. Compared with our results, the underlying psychological core was completely missing, and that core can only be determined in the context of the zeitgeist.

Nor was it possible to identify the psychological structure that is essential for us: What are the conflicting motives, and what are the dynamics and turning points at which approval of the product turns into rejection? Also missing were atmospheric qualities and any embedding in the cultural context. None of these fundamental “insights”, which psychological analysis needs for real understanding, could be found in the bullet-point list.

Conclusion: plausible and not entirely wrong, but too superficial. An internet search on the topic would probably have given us similar results. We often do such a preliminary investigation to get into a topic before writing the guideline for the interviews with humans. If we do this internet search ourselves, it takes much more time, but has the advantage that we know which sources the information comes from. ChatGPT produces the most likely answers from somewhere. If the AI has been fed with a lot of reputable data from existing studies on the topic, the results are good and can replace our own time-consuming preliminary internet search. However, especially in innovation research, we often investigate a completely new topic for which there is not much reliable knowledge yet.

AI as respondent

In this setting, the shortcomings were apparent on first reading, somehow between the lines. We had heard something similar from our respondents in a study on AI journalism that we conducted last year. “Somehow … so loveless” was one response, or “This was definitely written by someone who wasn’t at the event himself, just copied from hearsay”. An example: ChatGPT was given the findings on a type of vacationer and asked, “Put yourself in the shoes of the ‘Dissatisfied’ and describe how you feel about life.” The answer:

“My attitude toward life is characterized by a consistent discrepancy between current life circumstances and personal expectations or desires. It manifests itself in a feeling of incompleteness, a permanent dissatisfaction with external factors. There is a desire for change, but often without a clear path to its realization.”

All true. But nobody talks in such a stilted way!

Comparing this with the answers from the real conversations on the same question, we noticed that the human respondents tended to use more metaphors, often thought back and forth, and rephrased what they had said before, as if they were actually reflecting on their feelings physically and sensually at the moment of describing them. This prompted us to give ChatGPT the same task again, this time asking it to use metaphors.

Here’s part of the result (only part, because ChatGPT just didn’t stop):
“My life feeling is like I’m constantly walking in shoes that are too tight, while navigating a forest of disapproving statues that constantly raise their fingers. My interior resembles an abandoned library, where dust and shadows overlay once-vibrant stories. The days pass like endless train cars that I view from the platform of a train station without ever setting foot inside…”

If you would like to read an interesting book about the complex human achievement of finding the right metaphors, analogies or idioms, we recommend “Surfaces and Essences: Analogy as the Fuel and Fire of Thinking” by Douglas Hofstadter and Emmanuel Sander. ChatGPT manages some metaphors quite well, because it is very good at simulating that it has the fuel and fire of thinking. But with longer texts one notices: it has not experienced, perceived or done anything itself, but has only recombined and varied its word creations, and one senses this, somehow, between the lines.

Conclusion: AI is a good simulator, but it is not suited to authentic descriptions of experiences, actions and feelings, because a computer program has no body, no senses and no feelings, even if it can often feign them well. This helps us in the psychological analysis of AIs, but not in that of human experience and behavior.

AI as an uninformed idea developer

As a comparison for this task, we had a list of 30 ideas, described in detail, that we had developed for a client on a specific topic. These were already sorted into small improvements, big improvements and disruptive ideas. ChatGPT mainly named ideas that already existed on the market, which we knew because we had also done extensive background research. One or two ideas corresponded to our small improvements, e.g.: “Personal connectivity advisor: each customer receives a personal advisor who helps find the best offer based on the individual usage profile.” The comparison portal Verivox sends its regards.

In addition, ideas came up that we had not listed because they did not match the research findings, such as: “Integrated virtual reality communication: instead of a simple video call, customers could put on VR glasses during the customer consultation.” But telephone, email and the like were invented partly so that you don’t have to talk face to face with people, such as an agent, with whom you don’t want a personal relationship. The video call already proved problematic for this reason; in three dimensions it is likely to be even more so.

Conclusion: Not much help, because the ideas generated are mainly obvious and conventional, or they miss the needs of the consumers.

AI as a free idea developer

In this task, the AI received the results of the real psychological study in advance and was supposed to develop ideas on this basis, i.e. with source material similar to what we ourselves had when developing ideas. For the most part, the ideas no longer missed the needs, but they often took up the information from the study in a rather one-dimensional way, e.g., for the stressed health vacationer: “Forest yoga and meditation for relaxation: Special platforms or clearings in the forest reserved for yoga and meditation sessions.” The forests around spas are already half-cleared because so many such yoga platforms exist there.

Conclusion: The ideas are too obvious and show a low level of creation. Often they merely restate an insight as an idea. What was interesting for us here was that feeding in results from the study apparently limits the creativity of the AI, similar to what we know about human participants in creation workshops. We therefore addressed this problem in the next setting.

AI as a participant in a creation workshop

Humans are creatures of habit and rely on familiar thinking patterns. This has great advantages, because relying on the tried and true is safer and usually faster; routines are mentally economical. However, it prevents out-of-the-box thinking. To counter this, we use procedures for so-called “Creative Destruction” in creation workshops. One such method, which we developed ourselves and have tested successfully in many workshops, is our “Nimmo” technique. We played through it with ChatGPT, first on a fictional example: improve a washing machine by first listing the advantages of an apple compared to a washing machine and then transferring them back to the washing machine, thus coming up with new ideas. As a check, we created such a list ourselves using HI.

ChatGPT’s first positive reaction: While human participants are often irritated at first, and some consider it nonsense to compare apples with washing machines, ChatGPT was immediately willing to cooperate. Here the advantage of the AI becomes apparent: it has no limiting thinking patterns, as long as you do not give it any, such as the research results in the previous setting. At first, the ideas for the advantages of the apple were still relatively obvious and covered only part of our list. Finally, once we asked the AI to look for more distant ideas, it easily surpassed our list.

However, the second part of the task, developing ideas for a better washing machine from the advantages of the apple, then drifted too far into the fanciful and lacked a connection to reality, e.g.: “Mysticism and folklore (apple in legends): A washing machine developed in collaboration with storytellers and designers to integrate stories and myths into its function and design, so that every wash is a journey to another world.” We then asked the AI tools to come up with more practical ideas. Technical, practical ideas seem to suit them better; this one, for example, was not bad:

“Energy and water efficiency: using advanced sensors and algorithms, the machine can optimize water and energy consumption based on the current load, saving resources and reducing operating costs.”

As a next step, we again took a real-world project as a reference, one where the AI already knew the findings. The strategic approach using creative techniques worked well in this case too. Knowing the findings no longer had a limiting effect.

Conclusion: AI proves to be a good participant in creation workshops. It does not need to be “creatively destroyed” but is willing to participate immediately. However, the creation strategy, i.e. the steps in which it should proceed, must be given to it. Knowledge of the research findings then no longer has a limiting effect, and the result is not only conventional ideas but also quite fanciful ones. Regarding the Nimmo technique, ChatGPT couldn’t help but remark: “Interesting challenge! So we use the advantages of the apple as inspiration to develop a better washing machine or the laundry washing process”, including a winking smiley. Here it is simulating again, of course, because software cannot find anything interesting.

AI as designer

The AI already knew the study insights about the health vacation and was supposed to develop an image idea on this basis. It made a genuine effort to visualize the findings from the study, with results such as the following, which is already shortened:

“A person (perhaps in back view, so that the target group can identify with the person) is lying on a comfortable lounger surrounded by soft cushions and a drink or a book could lie next to him, symbolic of relaxation and ‘letting yourself be pampered.’ Next to the lounger could be a small table with information brochures or books on various ‘alternatives for a different everyday life’. In the center field: a path or way leading away from the lounger to a hidden, only half-visible door or gate; this symbolizes the possibility of diving into the depths of one’s own self. In the background, an atmospherically dense and haunting landscape. Conceivable would be a forest enveloped by a gentle mist, with tall, old trees, a mysteriously shimmering lake and a few silhouettes of people implied, which emphasizes ‘being alone or in a small community.’”

In theory everything is included and everything has been thought of. But not only was Midjourney completely overwhelmed by the realization; any human designer would have been too, and so would the viewer, because with all the things that are supposed to be shown in one picture, it is more like a hidden-object puzzle. In addition, the cushions do not fit the misty, mysterious forest atmospherically, and a mystical gate into the underworld of the soul in the middle of the landscape cannot be reconciled in look and feel with a pile of information brochures.

Other image ideas and their subsequent execution by Midjourney turned out less crowded and inconsistent, but they were rather stereotypical and not very creative or original, just like what can already be seen in a thousand ads for health resorts. Or they didn’t quite fit the needs, especially when rather subliminal image messages were meant to only lightly hint at something or portray subtle nuances, as in the case of an expectantly joyful yet hesitant facial expression.

By the way, the task of developing ideas for our cartoon “Wissbert, the market researcher”, which is published in the marketing magazine “Planung & Analyse”, was similarly difficult for the AI. As unintentionally funny as the seriously meant metaphorical descriptions were that ChatGPT presented as a “test person”, the ideas for stories that were supposed to be humorous were just as unfunny: they were, at best, trying to be funny, and often in such a way that the scene could not even be turned into a cartoon image.

Conclusion: In the development of design ideas, the same deficit as with metaphors is apparent again. The AI lacks sensory imagination. It basically puzzles together correctly understood information without a feeling for aesthetic coherence and without any idea of how the whole thing fits into just one image, where the logo and other picture elements are also competing for attention.

At most, Midjourney and Firefly offer support as a kind of ‘assistant graphic artist’ for realizing one’s own picture idea. At present, however, it is mostly still necessary to edit the AI-generated pictures in Photoshop, or to have the AI generate the individual picture elements such as protagonists, background or objects and then compose them into a coherent picture in Photoshop oneself. It remains to be seen whether future generations of image-generating AI, such as DALL-E 3, which was announced at the time this article was published, will be better able to generate more complex compositions from a text prompt. As is well known, a picture says (and needs) more than 1000 words.

Humans and AI as a Win/Win Team

The little experiment did indeed give us more clarity. In summary, the result is actually simple.

Positive, and something we will use more often and in a targeted way in the future:

👍 Research and brainstorming
The AI’s large pool of knowledge and its ability to quickly connect and prepare information in the required form can support a preliminary internet search, as long as the topic is not a completely new product. One should also remain skeptical about whether the information is correct, because the sources are unknown. Preliminary internet search and topical brainstorming help to create the guideline for conversations with real people.

👍 Fanciful collection of ideas
The AI also proves to be a valuable participant in a creation workshop. It is open to any idea-development task, no matter how unusual, and can produce very fanciful creations. However, you have to give it a well-thought-out strategy to elicit unusual ideas that are nevertheless based on the study insights.

👍 Image generation following your instructions
The same applies to realizing your own (i.e., human) image ideas. Especially in design guide exploration, you need a variety of different image motifs as test and trigger material. AI makes you a little less dependent on stock photos, saves you some time and allows you to realize the occasional image idea that would otherwise be much more time-consuming to assemble completely by yourself in Photoshop. However, it can no more take over the systematic selection of pictures on the basis of the research findings than it can take over the development of the image ideas themselves.

Limits show up clearly with all abilities that require lived, physical and sensory experience and intentionality. Here one should be cautious and not fall for the (sometimes good) simulation by the tools:

👎 Understanding
Human input is essential if you really want to know (and that’s what we want in psychological research) how people experience, imagine and feel something, and if you want to understand fundamentally and deeply why people behave the way they do. The deficit that AI ultimately only simulates human characteristics is particularly evident in creating design ideas, but likewise in the lack of realism in creating product ideas: a washing machine made of bamboo may meet the criteria of sustainability, but it certainly fails because of some laws of physics.

👎 Coherence
The AI also doesn’t know whether something is coherent, e.g., when respondents express themselves in a somehow strange way and this is a hint to dig deeper to get to the bottom of what is actually behind a verbalized need. It does not know what makes a good idea and does not recognize whether an idea actually solves a problem. Here it also lacks important human qualities: a vision, the vague notion of an ingenious solution, and the euphoric, urgent effort to find it.

👎 Development
The AI has no intention of finding a good solution. It lacks the joy and pride of having grown beyond your limits together as a team, followed by the passion to keep at it and push the idea further. Do you want to hand that over to the AI? It wouldn’t make much sense anyway, because you won’t get far with a mere collection of ideas. There is still a long way to go before you have a good and coherent innovation that also evokes enthusiasm among consumers. But AI can be a valuable companion in the process, providing inspiration, helping to overcome barriers to thinking, and accelerating the process without compromising on quality.

Additional information about the test setting

For text generation, OpenAI’s ChatGPT was used, based on GPT-3.5 and (mainly) GPT-4.

The prompts for ChatGPT (written text instructions) were systematically varied in terms of instruction, amount of background information given, and “temperature”. The “temperature” setting (values between 0 and 1 were used) controls how random the response is: low values produce more predictable, sober wording, while higher values produce more varied and often more colorful wording.

For image generation, Midjourney version 5.2 and Adobe Firefly were used. Adobe Firefly has recently become part of Adobe Photoshop.

Neuroflash was also used for both text and image generation. Neuroflash is a German-language, AI-powered software for automatic text and image generation that offers additional features such as automated creation of social media posts.

The reference projects used for comparison were current projects from the media, telecommunications and tourism sectors.

The experiment was conducted in September 2023.
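
For readers curious about what the “temperature” setting mentioned above does, here is a minimal sketch, assuming the common approach in which a language model's raw scores for candidate next words are divided by the temperature before being turned into probabilities. The function name, the three candidate words and their scores are made up for illustration; this is not OpenAI's actual implementation.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        # Assumption: only positive values are handled in this sketch.
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words (illustrative only).
candidates = ["dissatisfied", "restless", "shoes"]
logits = [2.0, 1.0, 0.1]

for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, dict(zip(candidates, (round(p, 3) for p in probs))))
```

At a low temperature almost all of the probability falls on the top-scoring word, so answers become predictable; at higher temperatures the less likely words get a real chance, which is one plausible reason why responses then read as more varied.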

Disclaimer

This text was written in German without the help of AI. The translation into English was done automatically with the help of AI and was not checked by native speakers. The cover image (collage) may contain small traces of AI (and a lot of manual work in Photoshop). The chapter dividers are 100% human-made and free of synthetic ingredients.
