Coexistence of machine intelligence, cyber art, and diagnostics: is it possible?


Abstract

The development of machine intelligence and the use of generative images created with it are a promising area of communication design and human–machine interaction. This letter to the editor presents the author’s vision of using generative images to diagnose human conditions.

Using machine intelligence as an interactive and intelligent diagnostic tool will allow psychologists and physicians to effectively complement therapeutic processes that involve controlled interactions with their users.

Libraries of models and sets of applications with text-to-image algorithms are already available, and engineers and designers can use them to create objects of modern digital art. They can also be applied to investigate new paradigms of visual communication and their use in experimental diagnostics.

Full Text

VISUAL PERCEPTION OF IMAGES

Machine learning (ML) is widely used in diagnostic settings for pathology classification, search, and visualization, including the diagnostics of Alzheimer’s disease, one of the most studied topics in terms of publication activity [1, 2]. Alongside established ML algorithms (in particular, the support vector machine), artificial neural networks and generative models for creating visual content (text2image) are now being developed, expanding the diagnostician’s toolkit. A text2image model is an algorithm that generates an image from a text query.

In contemporary culture, the perception of visual images, such as artistic images, is directly related to emotional and cognitive processes, to the personal characteristics of the perceiver, and to each person’s individual interpretation. In fact, how we perceive things such as abstract images (Fig. 1) can tell us a lot about ourselves. In their work [3], M.F. Koich and F. Pessotto showed that distortions in the emotional perception of images are associated with individual personality traits. According to the authors’ study, the feeling of joy when presented with certain images correlated with sociability, and the feeling of fear correlated with the ability to resist aggression and defend personal boundaries.
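As a rough illustration of the kind of association reported in [3], the sketch below computes a Pearson correlation between per-participant joy ratings and sociability scores. The function name, the variable names, and all numbers are hypothetical placeholders rather than data from [3], and the statistical procedure actually used by the authors may differ.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, one pair per participant (not data from [3]).
joy_ratings = [2.0, 3.5, 4.0, 1.5, 5.0]
sociability_scores = [10.0, 14.0, 15.0, 9.0, 18.0]

r = pearson_r(joy_ratings, sociability_scores)
print(f"r = {r:.2f}")
```

A coefficient close to 1 or -1 would indicate a strong linear association between the two measures; a real study would also need a significance test and a sample far larger than this toy example.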

 

Fig. 1. Images generated by a neural network.

 

A promising technology for delivering artistic content is virtual reality, in which the user (patient) creates their own reality, “transitional” between the inner world and external reality, which can be explored in cooperation with a psychologist or doctor [4]. Thanks to virtual reality, researchers have acquired new tools with unique capabilities. For example, F. Paladines-Jaramillo et al. [5] adapted the Rosenzweig Picture-Frustration test by transferring its stimulus material, pictures of various situations, to a virtual environment.

Technologies such as general-purpose artificial intelligence will eventually be able to become a natural part of the therapeutic processes into which they are integrated. Before they can be adapted and introduced into practice on a mass scale, special therapeutic applications and systems must be researched and developed.

MACHINE INTELLIGENCE

The possibilities of machine intelligence are expanding rapidly, keeping up with advances in virtual reality technology. Over the past year, we have seen stunning generative digital art objects¹, design objects, photorealistic paintings, and pictorial images created using generative adversarial networks (GANs) and diffusion models (DMs), such as DALL-E 2, Imagen, ruDALL-E, VQGAN, Stable Diffusion, Latent Diffusion, Disco Diffusion, and so forth. These models operate on the principle of converting input text into an image.

The joint interaction between a person, who develops the algorithm and enters a text query, and a GAN (or DM) already produces an additional creative effect [6]. Here, the computational result of the text2image model is a digital object, a two-dimensional image.

Notably, GAN-like models are also used to analyze neuroimaging data (computed tomography or magnetic resonance imaging²) [7, 8].

Machine intelligence handles text very well; with the current state of the art in artificial intelligence, the ability to predict the next element of a text is important for understanding its meaning and creating new meaningful texts. Some algorithms for creating visual images similarly rely on “prediction of the next pixel”. However, unlike text models (GPT-3, etc.) and the phrases they generate, dialog between people involves synchronization at the level of neuropsychological functions [9]; for example, this synchronization increases when a shared emotional field is involved [10]. The positive effects of this neural synchronization are applied in communication experiments [11, 12].

VISUAL PERCEPTION AND EMOTIONS

Developers are constantly striving to improve the functionality and performance of neural networks (applications such as DALL-E 2, ruDALL-E, Stable Diffusion, Midjourney, etc.), and the emergence of these tools inspires scientists to analyze the visual perception of the meanings inherent in artistic objects using generative art³ [6, 13]. In this regard, a logical question arises: is the perception of digital art objects related to the personal characteristics of the beholder? In particular, P. Achlioptas et al. [14] studied the emotions that accompany the visual perception of works of art and the explanations viewers give for those emotions. Visual art was used in this experiment as stimulus material to evoke a strong emotional response. As the authors of [14] emphasized, the affective component is often underestimated when developing artificial intelligence systems.

Let us conduct a small experiment by answering the following question: “Which of the two images presented in Fig. 2 (a, b) was created by a neural network, in your opinion?”

 

Fig. 2. Images (a, b) generated by the neural network.

 

The answer is simple: both images (Fig. 2) were created using artificial intelligence [15].

Thanks to the development of text2image generative models, it is now possible to quickly create a thematic series of unique digital images using a neural network. Even now, almost any researcher can use such tools, generate new contextual images, and plan their own experimental design.
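One way such an experimental design could be planned is sketched below: a thematic series of text queries is built from a template, and each participant receives the resulting stimuli in an independently shuffled order. The themes, the template, the participant identifiers, and the function names are all hypothetical, and the call to an actual text2image model is deliberately left out, since it depends on the tool chosen.

```python
import random


def build_prompts(themes, template="an abstract painting expressing {theme}"):
    """Build one text query per theme (the template is a hypothetical example)."""
    return [template.format(theme=theme) for theme in themes]


def presentation_orders(stimuli, participant_ids, seed=0):
    """Give each participant the full stimulus set in an independently shuffled order."""
    rng = random.Random(seed)  # fixed seed keeps the design reproducible
    orders = {}
    for pid in participant_ids:
        order = list(stimuli)  # copy so the original series is not modified
        rng.shuffle(order)
        orders[pid] = order
    return orders


# Hypothetical themes; image generation itself would happen in the chosen tool.
themes = ["joy", "fear", "calm", "anger"]
prompts = build_prompts(themes)
orders = presentation_orders(prompts, participant_ids=["P01", "P02", "P03"])

for pid, order in orders.items():
    print(pid, order)
```

Shuffling the order per participant is a simple guard against order effects; a counterbalanced scheme such as a Latin square would be an alternative when the number of stimuli is small.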

Using visual art as a stimulus is an ecologically valid way to organize research. A person’s reactions encompass many experiences, including emotions and self-reflection. This experience is highly individual, and the reactions of people looking at the same object vary significantly. These individual differences are confirmed by the patterns of neuronal activity in different subnetworks of the brain [16].

CONCLUSION

To answer positively the question posed in this letter, “Is it possible to use machine intelligence to create generative images and apply them in experimental diagnostics?”, it is necessary to note that research at the intersection of psychology and generative art, where machine intelligence creates full-fledged artworks, contributes to the emergence of intelligent systems that support emotional human–machine interaction. Such systems will then be integrated into robots that, in the role of a social partner, will help a person adaptively manage and regulate their own emotions and, in the role of a medical assistant, will organize therapeutic activities.

Such an approach will be implemented not only as an interactive and intelligent tool for a psychologist or a doctor, for example, for the experimental diagnostics of affective processes in patients, but also as a more complex system⁴ that provides controlled interaction between the doctor, the machine intelligence, and the patient for the purposes of practical medicine.

ADDITIONAL INFORMATION

Funding source. This article was not supported by any external sources of funding.

Competing interests. The author declares that he has no competing interests.

Author’s contribution. The author made a substantial contribution to the conception of the work; the acquisition, analysis, and interpretation of data for the work; and the drafting and revising of the work. He gave final approval of the version to be published and agrees to be accountable for all aspects of the work.

 

1 For example, DALL-E 2 OpenAI (access mode: https://openai.com/dall-e-2); ruDALL-E (Dalle) Sber, SberDevices (access mode: https://rudalle.ru).

2 For more details, see reviews about the role of generative adversarial neural networks in the analysis of medical images [7, 8].

3 Generative art refers to art objects created using information technology, in particular GAN or DM.

4 Medical product.


About the authors

Andrey V. Vlasov

Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies; Izmerov Research Institute of Occupational Health

Author for correspondence.
Email: a.vlasov@npcmr.ru
ORCID iD: 0000-0001-9227-1892
SPIN-code: 3378-8650
Russian Federation, Moscow; Moscow

References

  1. Tanveer M, Richhariya B, Khan RU, et al. Machine learning techniques for the diagnosis of Alzheimer’s disease: a review. ACM Transactions on Multimedia Computing, Communications, and Applications. 2020;16(1):35. doi: 10.1145/3344998
  2. Sharma S, Mandal PK. A comprehensive report on machine learning-based early detection of Alzheimer’s disease using multi-modal neuroimaging data. ACM Computing Surveys. 2023;55(2):1–44. doi: 10.1145/3492865
  3. Koich MF, Pessotto F. Projective aspects on cognitive performance: distortions in emotional perception correlate with personality. Psicologia Reflexão Crítica. 2016;29(17):1–8. doi: 10.1186/s41155-016-0036-6
  4. Adaskina AA. Therapeutic possibilities of digital artistic creativity. Modern Foreign Psychology. 2021;10(4):107–116. (In Russ). doi: 10.17759/jmfp.2021100410
  5. Paladines-Jaramillo F, Egas-Reyes V, Ordonez-Camacho D, et al. Using virtual reality to detect, assess, and treat frustration. In: Morales RG, Fonseca C, Salgado ER, et al., editors. Information and communication technologies. TICEC 2020. Communications in Computer and Information Science. Vol. 1307. Cham: Springer; 2020. doi: 10.1007/978-3-030-62833-8_28
  6. Cetinic E, She J. Understanding and creating art with AI: review and outlook. ACM Transactions on Multimedia Computing, Communications, and Applications. 2022;18(2):1–22. doi: 10.1145/3475799
  7. AlAmir M, AlGhamdi M. The role of generative adversarial network in medical image analysis: an in-depth survey. ACM Computing Surveys. 2022. doi: 10.1145/3527849
  8. Ali H, Biswas R, Ali F, et al. The role of generative adversarial networks in brain MRI: a scoping review. Insights Into Imaging. 2022;13(8):1–15. doi: 10.1186/s13244-022-01237-0
  9. Lankinen K, Saari J, Hari R, et al. Intersubject consistency of cortical MEG signals during movie viewing. NeuroImage. 2014;92:217–224. doi: 10.1016/j.neuroimage.2014.02.004
  10. Nummenmaa L, Glerean E, Viinikainen M, et al. Emotions promote social interaction by synchronizing brain activity across individuals. Proceedings of the National Academy of Sciences. 2012;109(24):9599–9604. doi: 10.1073/pnas.1206095109
  11. Tseng PH, Rajangam S, Lehew G, et al. Interbrain cortical synchronization encodes multiple aspects of social interactions in monkey pairs. Sci Rep. 2018;8(1):4699. doi: 10.1038/s41598-018-22679-x
  12. Shanechi MM. Brain-machine interfaces from motor to mood. Nat Neurosci. 2019;22(10):1554–1564. doi: 10.1038/s41593-019-0488-y
  13. Vlasov A. GALA Inspired by Neo Klimt: 2D images processing with implementation for interaction and perception studies (preprint). 2022. doi: 10.13140/RG.2.2.10806.57928
  14. Achlioptas P, Ovsjanikov M, Haydarov K, et al. ArtEmis: affective language for visual art. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), October 6, 2021:11569–11579. doi: 10.48550/arXiv.2101.07396
  15. Gala Klimt. Digital art collection of pictorial poems. Ridero. 2022. Available from: https://www.researchgate.net/project/GALA-KLIMT. Accessed: 15.08.2022.
  16. Vessel EA, Starr GG, Rubin N. The brain on art: intense aesthetic experience activates the default mode network. Front Hum Neurosci. 2012;6:66. doi: 10.3389/fnhum.2012.00066


Copyright (c) 2022 Eco-vector

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Mass media outlet registered by the Federal Service for Supervision of Communications, Information Technology, and Mass Media (Roskomnadzor).
Registration number and date of the registration decision: series PI No. FS 77 - 79539, dated November 9, 2020.

