The Dememoriam platform is powered by AI and blockchain technology.
Generating a user story
The Dememoriam platform will use state-of-the-art deep learning technology, such as transformer networks, to generate a unique story for each user based on their inputs. A natural language model will take the user's input and produce coherent, natural-sounding responses.
At the moment, the Dememoriam platform makes use of one of the most capable AI language models on the market: OpenAI's text-davinci-003. This model, part of the GPT-3 family, has roughly 175 billion parameters and was trained on a massive corpus of text data obtained from the web. During training, the model was exposed to diverse text data, including news articles, fiction, and conversational data, allowing it to learn the patterns and structure of natural language. The training was performed using unsupervised learning, in which the model learns patterns in the data without explicit supervision or labeling. This training method allows it to understand the context and structure of user inputs and to produce stories that are relevant and personalized for each user. The Dememoriam AI team is also working on leveraging the training techniques and machine learning algorithms described above to develop our own language models. Dememoriam's models will be fine-tuned for a single task, generating user stories, so their performance on that task is expected to exceed that of general-purpose models.
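To make the flow concrete, the sketch below shows how user inputs might be assembled into a prompt and sent to a hosted completion model. The input field names, the prompt template, and the `generate_story` helper are illustrative assumptions, not Dememoriam's actual schema.

```python
# Sketch of story generation with a hosted language model.
# The prompt template and input fields are illustrative assumptions.

def build_story_prompt(inputs: dict) -> str:
    """Assemble user-supplied memories into a single prompt string."""
    return (
        f"Write a short, first-person life story for {inputs['name']}.\n"
        f"Key memories: {'; '.join(inputs['memories'])}.\n"
        f"Tone: {inputs.get('tone', 'warm and reflective')}.\n"
    )

def generate_story(inputs: dict) -> str:
    """Send the prompt to a completion endpoint (requires an API key)."""
    import openai  # runtime dependency; needs openai.api_key configured
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=build_story_prompt(inputs),
        max_tokens=512,
        temperature=0.8,
    )
    return response["choices"][0]["text"]

prompt = build_story_prompt({
    "name": "Alex",
    "memories": ["first bicycle", "a summer by the sea"],
})
```

The prompt-building step is deterministic and testable on its own, while the network call is isolated in `generate_story` so it can be mocked or swapped for a fine-tuned in-house model later.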
The stories generated by modern AI technology are also capable of incorporating elements of storytelling, such as narrative structure, character development, and dialogue, further enhancing the overall user experience. Integrating the GPT-3 natural language model into our platform allows users to gain a deeper, more personal connection with the content generated.
Creating the user avatar image
In addition to the AI technology used for processing natural language, the Dememoriam platform will also leverage machine learning models with image processing capabilities. Avatar generation algorithms will provide users with a unique visual representation of themselves. The model takes a photo of the user as input and generates 50 image variations depicting the user in different styles. The technology behind avatar creation uses not only the user's input photo but also the story generated by the machine learning language model. This integration allows our web platform to create a truly personalized experience for each user: the avatars are based not only on the user's physical appearance but also on the unique aspects of their personality and experiences captured in the story generated by the GPT-3 language model.
The avatar generation technology utilizes state-of-the-art deep learning algorithms, such as generative adversarial networks (GANs), to generate high-quality, visually distinct avatars. A GAN consists of two parts: a generator and a discriminator. The generator generates images based on a random noise vector, and the discriminator determines whether the generated image is real or fake. During training, the generator and discriminator are trained in a competition-like manner, where the generator tries to generate images that are indistinguishable from real images and the discriminator tries to correctly identify whether an image is real or generated. Over time, the generator learns to generate images that are increasingly realistic, while the discriminator becomes better at correctly identifying the generated images. The training process involves exposing the model to a large number of images, allowing it to learn the patterns and features that are present in real images. Training the algorithm on a diverse dataset of images enables it to generate avatars that are both realistic and diverse. While Dememoriam’s image processing AI algorithms are in their early stages, our development team is working on a collaboration with Prisma Labs, one of the most successful machine learning companies specializing in image processing.
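The adversarial loop described above can be illustrated with a deliberately tiny, pure-Python sketch in which the "images" are single numbers drawn from a target distribution. A real GAN uses deep convolutional networks and image data; the toy generator, discriminator, and learning rate below are assumptions made purely to show the shape of the competition.

```python
import math
import random

random.seed(0)

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

class Generator:
    """Maps a noise sample z to a fake data point via x = w*z + b."""
    def __init__(self):
        self.w, self.b = 1.0, 0.0
    def sample(self) -> float:
        return self.w * random.gauss(0, 1) + self.b

class Discriminator:
    """Scores a data point with a probability-of-real, sigmoid(a*x + c)."""
    def __init__(self):
        self.a, self.c = 0.1, 0.0
    def score(self, x: float) -> float:
        return sigmoid(self.a * x + self.c)

G, D = Generator(), Discriminator()
real = lambda: random.gauss(4.0, 0.5)   # toy "real image" distribution

lr = 0.01
for _ in range(200):
    # Discriminator step: push score(real) up and score(fake) down.
    xr, xf = real(), G.sample()
    sr, sf = D.score(xr), D.score(xf)
    D.a += lr * ((1 - sr) * xr - sf * xf)
    D.c += lr * ((1 - sr) - sf)
    # Generator step: shift output toward what the discriminator calls real.
    xf = G.sample()
    G.b += lr * (1 - D.score(xf)) * D.a

p = D.score(real())   # discriminator's belief that a real sample is real
```

Over many such alternating steps the generator's samples drift toward the real distribution while the discriminator sharpens, which is the dynamic the paragraph above describes at image scale.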
This combination of advanced algorithms and large-scale training data enables the technology to generate avatars that are not only visually appealing but also highly personalized for each user.
Creating a voice for the avatar
The Dememoriam platform will take advantage of yet another deep learning tool, the text-to-speech model, which will be used to generate a realistic voice for the AI-generated avatar images. The text-to-speech technology leverages machine learning and artificial intelligence to clone and generate realistic-sounding voices. The algorithm can add a wide range of emotions to the voice without any additional data, making it easy to customize the tone and mood of the voice to match the story generated by the machine learning language model. Additionally, the voice can be converted into multiple languages, making it easy to provide a customized voice experience for users all over the world. The platform supports more than 10 of the most widely spoken languages, including English, French, German, Dutch, Spanish, and Italian.
These text-to-speech models, typically based on recurrent neural networks (RNNs) and transformers, have an encoder-decoder architecture that processes the input text and generates the speech output. The training process exposes the models to the audio data and fine-tunes them to reduce the gap between the synthesized speech and real human speech. Clusters of NVIDIA T4 GPUs were used with Amazon Elastic Kubernetes Service (Amazon EKS) to train and run the speech models.
Data augmentation and other techniques are used to improve the accuracy and generalization ability of the models. Once the models have been trained, they are put into production by using the NVIDIA Triton Inference Server, which is fast and reliable.
Finally, the AI technology can take real voice recordings and blend in synthetic content for a seamless experience. This allows the Dememoriam platform to replace, add, or remove any speech from an avatar seamlessly, providing users with a truly unique and personalized voice experience. The versatility and customization options offered by the text-to-speech API make it a valuable tool for generating realistic voices for users' avatars.
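One way to picture the blending of real and synthetic speech is a sample-level crossfade between segments. The sketch below uses placeholder sample lists and a linear fade; it illustrates the idea only and is not the platform's actual audio pipeline.

```python
# Conceptual sketch of splicing synthetic speech into a real recording.
# Segments are lists of PCM samples; a short linear crossfade smooths joins.

def crossfade(a: list, b: list, overlap: int) -> list:
    """Join two sample lists, blending `overlap` samples linearly."""
    if overlap == 0 or not a or not b:
        return a + b
    head, tail = a[:-overlap], a[-overlap:]
    mixed = [
        ta * (1 - i / overlap) + tb * (i / overlap)
        for i, (ta, tb) in enumerate(zip(tail, b[:overlap]))
    ]
    return head + mixed + b[overlap:]

real_take = [0.0] * 8    # placeholder samples from a real recording
synthetic = [1.0] * 8    # placeholder samples from synthesized speech
spliced = crossfade(real_take, synthetic, overlap=4)
```

Replacing or removing a word then reduces to cutting a segment out and crossfading a synthesized segment into the gap.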
Converting the avatar image into a video
Dememoriam will also integrate an AI tool that leverages deep learning technology to turn images of faces into high-quality, photorealistic videos.
The model behind this tool is trained on tens of thousands of videos and images, allowing it to capture the patterns and correlations between them and generate realistic video outputs. The training data is a combination of real and synthetic data and often includes a diverse range of poses, expressions, and scenarios to capture the variety of movements and appearances in real-world videos. The architecture of the model consists of two main components: an image-to-video generator and a video discriminator, which work together to produce photorealistic videos from still images. The model is trained using deep learning techniques, such as Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs), and is optimized for efficient deployment using GPUs.
The AI technology allows users to turn their avatar images, generated by the other AI model, into videos with accompanying audio, creating a seamless and expressive experience.
Optional verification step (KYC)
To ensure the authenticity of user-uploaded photos, our platform will employ AI models that specialize in image validation and facial feature extraction. The models will analyze each uploaded image to confirm its authenticity and check that it has not been digitally manipulated or altered. The AI will then extract key facial features from the image and match those features with the facial features extracted from the user's government-issued ID during the "Know Your Customer" (KYC) step. This multi-step process will provide an additional layer of security to ensure that the "verified" badge on our platform is only granted to real and legitimate users. The models used for this process will be trained on large datasets of real images, and will use deep learning techniques, such as Convolutional Neural Networks (CNNs), to accurately and efficiently validate images and extract facial features.
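The matching step can be sketched as a similarity check between two embedding vectors, one extracted from the uploaded photo and one from the ID document. The embeddings and threshold below are made-up illustrations; a real system would obtain the vectors from a trained CNN and calibrate the threshold on labeled data.

```python
import math

# Sketch of the KYC face-matching step: compare a face embedding from
# the uploaded photo with one from the government-issued ID.
# Embeddings and threshold are illustrative placeholders.

def cosine_similarity(u: list, v: list) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def is_same_person(photo_emb, id_emb, threshold: float = 0.8) -> bool:
    """Grant the 'verified' badge only when the embeddings agree."""
    return cosine_similarity(photo_emb, id_emb) >= threshold

photo = [0.9, 0.1, 0.4]    # placeholder embedding from the uploaded photo
id_doc = [0.8, 0.2, 0.5]   # placeholder embedding from the ID document
match = is_same_person(photo, id_doc)
```

The authenticity check mentioned above (detecting digital manipulation) would run before this step, so only validated images reach the matcher.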
Converting your speaking avatar into an NFT
The process of creating a non-fungible token is called minting. The term refers to the process of turning a digital item into an asset on the blockchain. As soon as the user starts the minting process, the Dememoriam platform generates a smart contract along with the token standard and metadata for the non-fungible token. This combination results in a digital asset being stored on the blockchain. Each user will have a digital cryptocurrency wallet integrated into the platform; with it, users can display, upload, and transfer their NFTs through the platform seamlessly.
NFT Metadata
A non-fungible token is not just one thing; it is a combination of two key components: the token standard itself and the digital content, in this case a speaking-avatar media file. This file, the content and flavor of the NFT, is described by its metadata. The token standard is the back-end part of an NFT, allowing assets to be "owned" by attaching them to the blockchain. Frequently, the token itself does not contain any artwork, video, or audio files; it typically holds only the owner's address and a description of the work. The content of a non-fungible token, such as an MP4 video file, is generally stored off-chain. Dememoriam leverages two data storage methods, supporting both off-chain and on-chain file storage, in order to create an immersive user experience. As a result, Dememoriam platform users can easily display their digital assets in both places: a blockchain explorer and Dememoriam's own platform. In addition, the Dememoriam team plans to develop its first blockchain for an even better user experience and product scalability.
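As an illustration of what such metadata looks like, the sketch below builds a document in the style of the widely used ERC-721 metadata convention. The field layout, the attribute values, and the CID placeholder are assumptions for illustration rather than Dememoriam's actual format.

```python
import json

# Sketch of the metadata document a speaking-avatar NFT might point to,
# following the common ERC-721 metadata convention. The CID value is a
# placeholder, not a real content hash.

def build_nft_metadata(name: str, description: str, video_cid: str) -> str:
    metadata = {
        "name": name,
        "description": description,
        # animation_url is the conventional field for video/audio content
        "animation_url": f"ipfs://{video_cid}",
        "attributes": [
            {"trait_type": "type", "value": "speaking avatar"},
        ],
    }
    return json.dumps(metadata, indent=2)

doc = build_nft_metadata(
    "Speaking Avatar #1",
    "AI-generated speaking avatar minted on Dememoriam.",
    "placeholder-cid",
)
```

The on-chain token then only needs to store a pointer to this document, which keeps minting cheap while the media file itself lives in off-chain storage.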
NFT Storage (on-chain)
Generally speaking, blockchains are not efficient enough to handle high volumes of data traffic. The Dememoriam development team has therefore researched new options for storing token metadata on-chain. Support from multiple networks will enable Dememoriam NFT metadata to be linked and displayed in a blockchain explorer. Once launched, the Dememoriam platform will be one of the few standalone platforms where users can display their assets on-chain, a new way of expressing yourself in a decentralized world.
NFT Storage (off-chain)
In order to achieve an immersive experience, the Dememoriam platform will also leverage off-chain data storage, which allows users to act on their digital assets in a matter of seconds. Off-chain storage does not have to rely on central servers: IPFS (the InterPlanetary File System) is a decentralized alternative, a distributed storage network that turns computers across the planet into peer-to-peer "storage nodes" by using their free bandwidth to store files, websites, and apps. There is no way of knowing which nodes are hosting a given file or site. The IPFS off-chain storage system is an extremely convenient way to exchange metadata files inside Dememoriam's ecosystem.
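For display in an ordinary browser, an `ipfs://` content URI stored in token metadata is typically resolved through an HTTP gateway. The sketch below shows that mapping; the gateway host is just one public example and the CID is a placeholder.

```python
# Sketch of resolving an ipfs:// content URI to an HTTP gateway URL so
# off-chain assets can be displayed in a browser. Any public IPFS
# gateway works the same way; the host below is one example.

def ipfs_to_gateway(uri: str, gateway: str = "https://ipfs.io/ipfs/") -> str:
    if not uri.startswith("ipfs://"):
        raise ValueError("not an ipfs:// URI")
    return gateway + uri[len("ipfs://"):]

url = ipfs_to_gateway("ipfs://QmExampleCid/avatar.mp4")
```

Because the CID is derived from the file's content, the same URI resolves to the same file regardless of which gateway or node serves it.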
Intellectual Property & NFT
Intellectual property (IP) is an incredibly useful tool that helps Dememoriam's users protect their rights and personalize and authenticate their avatars to their full potential. It is a powerful tool, and our development team leverages it heavily; ensuring that intellectual property rights are protected and respected is a paramount consideration in any NFT project. IP rights will be defined in the terms of the smart contract encoded in the NFT, allowing Dememoriam users to effectively control and monetize their IP.
Conclusion
The Dememoriam platform leverages the latest advancements in blockchain, data storage, deep learning, and AI technology to offer a truly personalized and secure experience for users. The platform utilizes a state-of-the-art AI language model to generate unique stories for each user, incorporating elements of storytelling, narrative structure, and dialogue. It also uses machine learning models for image processing to create unique avatars based on a user's physical appearance and on the aspects of their personality and experiences captured in the story. Furthermore, the platform uses text-to-speech models to generate realistic voices for the avatars in multiple languages. In addition, it utilizes IPFS storage and blockchain technology for flexible content management and storage options. The combination of these cutting-edge technologies results in a highly personalized and immersive experience for users.