Showcasing real-world applications through AI proof of concepts
Description
Detects tables in an image and returns the precise coordinate points of the detected table, while accurately extracting and redrawing the entire table to preserve its structure. The model identifies the coordinates of each individual cell and performs Optical Character Recognition (OCR) on each cell separately to capture the data effectively.
Tools/Technologies
Description
Detects tables in an image and returns the precise coordinate points of the detected table, while accurately extracting and redrawing the entire table to preserve its structure. The model identifies the coordinates of each individual cell and performs Optical Character Recognition (OCR) on each cell separately to capture the data effectively.
Tools/Technologies
Description
Processes a PDF document as input and uses Retrieval-Augmented Generation (RAG) to answer queries related to the content of the uploaded PDF. It converts the PDF into embedding chunks using an embedding model. When a query is made, the model retrieves the relevant chunks from the embedded data and generates an accurate answer based on the retrieved information. This solution is tailored for healthcare-related PDFs, such as medical reports, clinical guidelines, or patient records.
Tools/Technologies
Description
Processes a PDF payslip as input and utilizes Retrieval-Augmented Generation (RAG) to answer queries related to the content of the uploaded payslip. It converts the PDF into embedding chunks using an embedding model. When a query is made, the model retrieves relevant chunks from the embedded data and generates accurate answers based on the retrieved information. This solution is specifically tailored for payslip-related PDFs, enabling users to gain insights into their earnings, deductions, and other relevant details.
Tools/Technologies
Description
It processes PDF invoices to efficiently respond to user queries about their content. By converting invoices into embedding chunks using an embedding model, it leverages Retrieval-Augmented Generation (RAG) to extract and provide precise answers. When a query is submitted, the model retrieves relevant information from the embedded chunks, allowing users to gain insights into billing details, payment statuses, itemized charges, and other crucial information found within the invoice. This tailored solution streamlines invoice management and enhances financial data accessibility.
Tools/Technologies
Description
Finds QR and Bar codes in images and extracts them, then scans to give their URLs. Includes tabs for QR, Bar code, and OCR for extracting card details.
Tools/Technologies
Description
Converts text queries to SQL, enabling users to interact with databases using natural language instead of writing SQL code. Fine-tuning defog/sqlcoder-7b-2 model for shopify store data.
Tools/Technologies
Description
An AI agent chat interface that creates, deletes and lists events within a particular time limit. It checks if a person is free or busy, and lists available free schedules on Google Calendar using API calls.
Tools/Technologies
Description
This is an interactive voice-to-voice system that allows users to engage in natural conversations. The system captures spoken input through microphone, transcribes it into text using a speech-to-text model (OpenAI's Whisper). This text is then processed by an LLM(llama-3.1-70b-versatile) to generate a contextually appropriate response. Finally, the response is converted back into speech using a text-to-speech model(facebook/mms-tts-eng), enabling verbal communication with the system.
Tools/Technologies
Description
This is an interactive voice call conversational AI system designed to confirm basic candidate details and conduct pre-interview calls. The system utilizes Twilio to initiate calls and manages the conversation using the Groq 'llama-3.1-70b-versatile' model. Twilio converts user speech to text, which is then sent to the LLM to generate a response. The response is relayed back to Twilio, which plays it during the call, facilitating a natural conversation. The LLM is specifically prompted to emulate an HR representative and ask relevant questions.
Tools/Technologies
Description
Real-time speech recognition converts spoken language into text instantaneously, enabling fast, accurate voice-to-text applications.
Tools/Technologies