Creating an AI-powered SaaS tool might sound daunting, but with the help of open-source projects available on GitHub, it becomes a more achievable task. In this article, I’ll share how I developed Ai Photocraft, a versatile AI tool capable of face swapping, cartoonizing, background removal, image upscaling, and image enhancement, using existing open-source projects.
Before diving into the development, I identified the need for a comprehensive image editing tool that could leverage the power of AI to perform advanced tasks. The goal was to create a user-friendly platform where even non-technical users could apply these complex transformations to their images with ease.
The open-source community offers a wealth of resources that can be adapted and integrated into various projects. Here are the key components and GitHub repositories that played a significant role in developing Ai Photocraft:
AI Face Swapping: For face swapping, I used the InsightFace library together with an ONNX swapping model, a deep learning setup capable of swapping faces with high accuracy and realistic results. It can swap faces between two images, or between faces within the same image, and it works together with the image enhancers. Face swapping is also our biggest source of search traffic: the multi-face swap tool, called couple face swapper, receives around 50 to 60 clicks a day from search engines, more than any other page on the site.
InsightFace Github Repository: https://github.com/deepinsight/insightface
ONNX Model Repository: https://github.com/onnx/models
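As a rough illustration of the same-image swap described above, here is a minimal sketch built on InsightFace. It assumes the `buffalo_l` detection pack and a local `inswapper_128.onnx` weights file; the helper names (`order_left_to_right`, `load_models`, `swap_within_image`) are hypothetical and not taken from the Ai Photocraft code base.

```python
def order_left_to_right(faces):
    """Sort detected faces by the left edge of their bounding box."""
    return sorted(faces, key=lambda face: face.bbox[0])


def load_models(swapper_path="inswapper_128.onnx"):
    """Load detector and swapper. Model pack and weights path are assumptions."""
    # Imported lazily so the pure helpers work without the GPU stack installed.
    import insightface
    from insightface.app import FaceAnalysis

    analyzer = FaceAnalysis(name="buffalo_l")
    analyzer.prepare(ctx_id=0, det_size=(640, 640))
    swapper = insightface.model_zoo.get_model(swapper_path)
    return analyzer, swapper


def swap_within_image(image, analyzer, swapper):
    """Exchange the two left-most faces in a single BGR image."""
    faces = order_left_to_right(analyzer.get(image))
    if len(faces) < 2:
        raise ValueError("need at least two faces to swap within one image")
    first, second = faces[0], faces[1]
    result = image.copy()
    # Both faces were detected on the original image, so their identity
    # embeddings are not affected by the first paste.
    result = swapper.get(result, first, second, paste_back=True)
    result = swapper.get(result, second, first, paste_back=True)
    return result
```

Sorting by bounding box keeps the pairing deterministic, which matters when the same photo is processed more than once.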
Cartoonizing Images: To convert photos into cartoon-style images, I employed DCT-Net (Domain-Calibrated Translation for Portrait Stylization), a deep learning model designed specifically for portrait stylization. DCT-Net is adept at creating high-quality cartoon-style images while maintaining the essence and details of the original portraits. It uses domain calibration techniques to adapt the style translation process, resulting in images that are both visually appealing and true to the subject.
DCT Net Project Page: https://menyifang.github.io/projects/DCTNet/DCTNet.html
Background Removal: For background removal, I utilized MODNet (Matting Objective Decomposition Network), a deep learning approach that excels at foreground segmentation and background replacement. MODNet is particularly effective in accurately distinguishing between the subject and the background, allowing for precise background removal and the ability to add new backgrounds seamlessly. This model’s lightweight and real-time capabilities make it ideal for delivering quick results without compromising on quality, enhancing the user experience significantly.
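A matting model such as MODNet predicts an alpha matte, a per-pixel value between 0 (background) and 1 (subject). Adding a new background then comes down to the standard compositing rule, alpha * foreground + (1 - alpha) * background. The sketch below shows that rule in plain Python on nested lists so it stays self-contained; the function names are illustrative, and a production service would more likely vectorize this with an array library.

```python
def blend_pixel(foreground, background, alpha):
    """Composite one RGB pixel: alpha * foreground + (1 - alpha) * background."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return tuple(
        round(alpha * f + (1.0 - alpha) * b)
        for f, b in zip(foreground, background)
    )


def replace_background(image, matte, new_background):
    """Apply the matte to every pixel.

    All three inputs are row-major nested lists with the same dimensions:
    `image` and `new_background` hold RGB tuples, `matte` holds floats.
    """
    return [
        [blend_pixel(fg, bg, a) for fg, bg, a in zip(img_row, bg_row, matte_row)]
        for img_row, bg_row, matte_row in zip(image, new_background, matte)
    ]
```

Pixels with fractional alpha, typically hair and soft edges, receive a mix of both images, which is what makes the new background look seamless.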
Image Upscaling and Enhancing: Improving image resolution and quality was made possible by using super-resolution models found in open-source repositories. These models employ deep learning to upscale images while maintaining clarity and adding details that are often lost in low-resolution pictures.
To enhance and upscale images, I utilized two powerful GAN-based models. For image enhancing, I employed GFPGAN (Generative Facial Prior-Generative Adversarial Network), which excels in restoring facial details and improving overall image quality. For image upscaling, I used ESRGAN (Enhanced Super-Resolution Generative Adversarial Network), which is designed to upscale images by increasing their resolution while preserving and enhancing fine details. ESRGAN’s ability to generate high-resolution outputs makes it an ideal choice for upscaling images without losing clarity or introducing artifacts.
GFPGAN Repository: https://github.com/tuttlebr/gfpgan
ESRGAN Repository: https://github.com/xinntao/Real-ESRGAN
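One way the two models could be chained is sketched below. It assumes the upstream `gfpgan` and `realesrgan` Python packages and their `GFPGANer` and `RealESRGANer` wrappers; the weight paths, the tile size, and the order (restore faces first, then upscale) are assumptions for illustration rather than the configuration Ai Photocraft actually runs.

```python
def build_models(esrgan_weights, gfpgan_weights):
    """Construct both wrappers. Paths and settings are placeholders."""
    # Imported lazily: these packages pull in PyTorch and are slow to load.
    from basicsr.archs.rrdbnet_arch import RRDBNet
    from gfpgan import GFPGANer
    from realesrgan import RealESRGANer

    net = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                  num_block=23, num_grow_ch=32, scale=4)
    upsampler = RealESRGANer(scale=4, model_path=esrgan_weights, model=net,
                             tile=400, half=True)
    restorer = GFPGANer(model_path=gfpgan_weights, upscale=1,
                        arch="clean", channel_multiplier=2)
    return restorer, upsampler


def enhance_then_upscale(image, restorer, upsampler, outscale=4):
    """Restore faces, then upscale the whole image (BGR array in and out)."""
    # GFPGANer.enhance returns (cropped_faces, restored_faces, restored_image).
    _, _, restored = restorer.enhance(
        image, has_aligned=False, only_center_face=False, paste_back=True
    )
    # RealESRGANer.enhance returns (output_image, image_mode).
    upscaled, _ = upsampler.enhance(restored, outscale=outscale)
    return upscaled
```

Processing in tiles (`tile=400` here) trades a little speed for lower peak GPU memory on large inputs.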
Text to Image: To add a trending feature to the project, we used Vertex AI to access the Imagen model, which generates realistic images from text prompts. In addition to basic prompting, we configured the server to support different image styles, such as aesthetic, realistic, cartoon, cinematic, and robotic.
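Style support like this can be as simple as appending a style hint to the user's prompt before calling the model. The sketch below assumes that approach; the hint wording, the `build_prompt` and `generate_image` names, and the model identifier `imagegeneration@006` are all illustrative assumptions, and `vertexai.init(...)` is expected to have been called with a project and location beforehand.

```python
# Hypothetical style hints; the real service may phrase these differently.
STYLE_HINTS = {
    "aesthetic": "soft lighting, pastel palette, aesthetic composition",
    "realistic": "photorealistic, natural lighting, high detail",
    "cartoon": "cartoon illustration, bold outlines, flat colors",
    "cinematic": "cinematic still, dramatic lighting, shallow depth of field",
    "robotic": "robotic, mechanical details, metallic surfaces",
}


def build_prompt(user_prompt, style=None):
    """Return the prompt with the chosen style hint appended, if any."""
    prompt = user_prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    if style is None:
        return prompt
    try:
        hint = STYLE_HINTS[style.lower()]
    except KeyError:
        raise ValueError(f"unknown style: {style}") from None
    return f"{prompt}, {hint}"


def generate_image(user_prompt, style=None, model_name="imagegeneration@006"):
    """Generate one image with Imagen on Vertex AI (model name is an assumption)."""
    # Imported lazily so prompt building can be used without the SDK installed.
    from vertexai.preview.vision_models import ImageGenerationModel

    model = ImageGenerationModel.from_pretrained(model_name)
    response = model.generate_images(
        prompt=build_prompt(user_prompt, style), number_of_images=1
    )
    return response.images[0]
```

Keeping the style table on the server means new styles can be added without touching the frontend beyond a new menu entry.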
Integration and Development
Building Ai Photocraft was not just about finding the right tools; it was about weaving them into a seamless SaaS platform that delivers a great user experience. Here’s how we brought it all together:
Building the Backend: I took on the task of creating the backend using Flask. This is where the AI magic happens: processing requests, managing images, calling the right models, and delivering results in real time. To handle the intensive computational requirements, we deployed the backend on servers equipped with GPUs from TensorDock. This setup allowed us to process images as quickly as possible, providing a smooth and efficient service.
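The request flow described above (receive an image, pick the right model, return the result) can be sketched with a small handler registry behind a single Flask route. Everything here is an assumption for illustration: the `/api/<action>` URL, the `image` form field, the PNG response type, and the placeholder `echo` handler standing in for a real model call.

```python
import io

HANDLERS = {}


def register(action):
    """Decorator that maps an action name to a bytes -> bytes handler."""
    def decorator(func):
        HANDLERS[action] = func
        return func
    return decorator


@register("echo")
def echo(image_bytes):
    """Placeholder handler; a real one would run a model on the image."""
    return image_bytes


def run_action(action, image_bytes):
    """Dispatch to the registered handler or raise KeyError."""
    handler = HANDLERS.get(action)
    if handler is None:
        raise KeyError(f"unsupported action: {action}")
    return handler(image_bytes)


def create_app():
    """Application factory; Flask is imported here so dispatch stays testable."""
    from flask import Flask, abort, request, send_file

    app = Flask(__name__)

    @app.post("/api/<action>")
    def process(action):
        upload = request.files.get("image")
        if upload is None:
            abort(400, description="missing 'image' file field")
        try:
            result = run_action(action, upload.read())
        except KeyError:
            abort(404, description="unknown action")
        # Assumes handlers return PNG-encoded bytes.
        return send_file(io.BytesIO(result), mimetype="image/png")

    return app
```

With this layout, adding a new tool means registering one more handler rather than writing another route.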
Frontend Development: For the frontend, I teamed up with my friend, who brought expertise in Next.js to the table. Together, we crafted a user-friendly interface that makes it simple for users to upload images, select actions, and see results almost instantly. We aimed to keep the process intuitive, ensuring that even those without technical know-how could use Ai Photocraft with ease.
Handling Backend Services with Firebase: To streamline backend services like authentication, database management, and other functionalities, we incorporated Firebase. This not only sped up the development process but also provided a robust and scalable solution for managing our user data and sessions.
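On the server side, Firebase authentication usually means verifying the ID token that the frontend sends with each request. A minimal sketch using the Firebase Admin SDK follows; it assumes the token arrives in an `Authorization: Bearer ...` header and that `firebase_admin.initialize_app()` has already been called at startup. The helper names are hypothetical.

```python
def extract_bearer_token(authorization_header):
    """Return the token from 'Bearer <token>', or None if malformed."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def verify_user(authorization_header):
    """Return the Firebase uid for a valid token, else raise PermissionError."""
    # Imported lazily so header parsing can be used without the SDK installed.
    from firebase_admin import auth

    token = extract_bearer_token(authorization_header)
    if token is None:
        raise PermissionError("missing bearer token")
    decoded = auth.verify_id_token(token)
    return decoded["uid"]
```

The returned uid can then serve as the key for per-user data such as sessions and usage records.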
Scalability and Deployment: To prepare for a growing user base, we utilized cloud platforms with auto-scaling capabilities. The entire application was containerized using Docker, and Kubernetes helped us manage deployment across multiple environments. We set up Nginx as a reverse proxy server and used Gunicorn to run the Flask application efficiently.
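For the Gunicorn part of this setup, a configuration file might look like the sketch below. The values and the `PHOTOCRAFT_*` environment variable names are assumptions, not the production settings; the one design point worth noting is that each worker process loads its own copy of the models, so on a GPU server the worker count is usually kept low.

```python
# gunicorn.conf.py (illustrative values only)
import os

# Listen on localhost; Nginx in front forwards public traffic here.
bind = os.environ.get("PHOTOCRAFT_BIND", "127.0.0.1:8000")

# Each worker holds its own models in GPU memory, so default to one.
workers = int(os.environ.get("PHOTOCRAFT_WORKERS", "1"))

# Threads let a worker accept uploads while another request is on the GPU.
threads = int(os.environ.get("PHOTOCRAFT_THREADS", "4"))

# Model inference can take far longer than Gunicorn's 30 second default.
timeout = 120

# Log requests to stdout so the container runtime can collect them.
accesslog = "-"
```

Assuming the Flask application factory lives in a module named `app`, this would be started with `gunicorn -c gunicorn.conf.py "app:create_app()"`.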
Payments and Subscriptions: To keep our servers running, we decided to introduce a subscription for users who need more than 10 credits a day. We integrated Paddle, a merchant of record platform, to handle subscription payments, tax remittance, and legal compliance worldwide.
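The daily credit rule can be expressed in a few lines. This sketch keeps usage in memory purely for illustration; a real deployment would persist the counters (for example in the Firebase database mentioned earlier) and take the subscriber flag from the payment provider. The class and method names are hypothetical.

```python
from collections import defaultdict
from datetime import date

FREE_DAILY_CREDITS = 10


class CreditLedger:
    """Tracks credits spent per user per day (in memory, for illustration)."""

    def __init__(self, free_daily_credits=FREE_DAILY_CREDITS):
        self.free_daily_credits = free_daily_credits
        self._used = defaultdict(int)

    def try_spend(self, user_id, is_subscriber=False, today=None):
        """Record one credit; return False if a free user hit the daily cap."""
        key = (user_id, today or date.today())
        if not is_subscriber and self._used[key] >= self.free_daily_credits:
            return False
        self._used[key] += 1
        return True
```

Because the counter is keyed by date, free credits reset naturally at the start of each day without a scheduled job.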
Bringing all these elements together was a challenging but rewarding process, transforming Ai Photocraft from a concept into a powerful AI-driven tool that anyone can start using for free.
Challenges and Learning
Developing Ai Photocraft came with its fair share of challenges. Integrating various open-source models demanded careful adjustments and occasional retraining to align with the tool’s specific needs. Ensuring these models were optimized for both speed and accuracy was critical to providing a high-quality user experience.
Throughout this journey, I’ve learned an immense amount, ranging from front-end development and working with sophisticated machine learning models to deploying applications in the cloud. Collaborating with friends and seeking advice from mentors enriched the development process, offering invaluable insights and support. It’s been a fantastic experience, and we’re continuing to fine-tune the pre-trained models to enhance the quality of results even further.
Conclusion
Creating Ai Photocraft has been a profoundly rewarding experience, showcasing the power and potential of open-source software. By leveraging technologies and resources available on GitHub, I was able to build a sophisticated AI tool without reinventing the wheel. Ai Photocraft stands as a testament to how developers can create impactful SaaS solutions through collaboration and innovation. Your feedback is appreciated as we continue to improve and evolve Ai Photocraft.